POLYKETIDE SYNTHASE ENZYMES AND RECOMBINANT DNA
CONSTRUCTS THEREFOR 



Cross-Reference to Related Applications

The present application claims priority to related U.S. patent application Serial 
Nos. 60/102,748, filed 2 Oct. 1998; 60/139,650, filed 17 June 1999; and 60/123,810, filed
11 Mar. 1999, each of which is incorporated herein by reference.



Field of the Invention

The present invention relates to polyketides and the polyketide synthase (PKS) 
enzymes that produce them. The invention also relates generally to genes encoding PKS 
enzymes and to recombinant host cells containing such genes and in which expression of 
such genes leads to the production of polyketides. The present invention also relates to
compounds useful as medicaments having immunosuppressive and/or neurotrophic
activity. Thus, the invention relates to the fields of chemistry, molecular biology, and 
agricultural, medical, and veterinary technology. 



Background of the Invention 

Polyketides are a class of compounds synthesized from 2-carbon units through a
series of condensations and subsequent modifications. Polyketides occur in many types of
organisms, including fungi and mycelial bacteria, in particular, the actinomycetes.
Polyketides are biologically active molecules with a wide variety of structures, and the
class encompasses numerous compounds with diverse activities. Tetracycline,
erythromycin, epothilone, FK-506, FK-520, narbomycin, picromycin, rapamycin,
spinosyn, and tylosin are examples of polyketides. Given the difficulty in producing
polyketide compounds by traditional chemical methodology, and the typically low 
production of polyketides in wild-type cells, there has been considerable interest in 
finding improved or alternate means to produce polyketide compounds. 
This interest has resulted in the cloning, analysis, and manipulation by
recombinant DNA technology of genes that encode PKS enzymes. The resulting
technology allows one to manipulate a known PKS gene cluster either to produce the
polyketide synthesized by that PKS at higher levels than occur in nature or in hosts that
otherwise do not produce the polyketide. The technology also allows one to produce
molecules that are structurally related to, but distinct from, the polyketides produced from
known PKS gene clusters. See, e.g., PCT publication Nos. WO 93/13663; 95/08548;
96/40968; 97/02358; 98/27203; and 98/49315; United States Patent Nos. 4,874,748;
5,063,155; 5,098,837; 5,149,639; 5,672,491; 5,712,146; 5,830,750; and 5,843,718; and
Fu et al., 1994, Biochemistry 33: 9321-9326; McDaniel et al., 1993, Science 262: 1546-
1550; and Rohr, 1995, Angew. Chem. Int. Ed. Engl. 34(8): 881-888, each of which is
incorporated herein by reference.

Polyketides are synthesized in nature by PKS enzymes. These enzymes, which are
complexes of multiple large proteins, are similar to the synthases that catalyze
condensation of 2-carbon units in the biosynthesis of fatty acids. PKSs catalyze the
biosynthesis of polyketides through repeated, decarboxylative Claisen condensations
between acylthioester building blocks. The building blocks used to form complex
polyketides are typically acylthioesters, such as acetyl, butyryl, propionyl, malonyl,
hydroxymalonyl, methylmalonyl, and ethylmalonyl CoA. Other building blocks include
amino acid-like acylthioesters. PKS enzymes that incorporate such building blocks
include an activity that functions as an amino acid ligase (an AMP ligase) or as a non-
ribosomal peptide synthetase (NRPS). Two major types of PKS enzymes are known;
these differ in their composition and in their mode of polyketide synthesis.
These two major types of PKS enzymes are commonly referred to as Type I or "modular"
and Type II or "iterative" PKS enzymes.

In the Type I or modular PKS enzyme group, a set of separate catalytic active
sites (each active site is termed a "domain", and a set thereof is termed a "module") exists
for each cycle of carbon chain elongation and modification in the polyketide synthesis
pathway. The typical modular PKS is composed of several large polypeptides, which can
be segregated from amino to carboxy termini into a loading module, multiple extender
modules, and a releasing (or thioesterase) domain. The PKS enzyme known as 6-
deoxyerythronolide B synthase (DEBS) is a Type I PKS. In DEBS, there is a loading
module, six extender modules, and a thioesterase (TE) domain. The loading module, six
extender modules, and TE of DEBS are present on three separate proteins (designated
DEBS-1, DEBS-2, and DEBS-3, with two extender modules per protein). Each of the
DEBS polypeptides is encoded by a separate open reading frame (ORF) or gene; these
genes are known as eryAI, eryAII, and eryAIII. See Caffrey et al., 1992, FEBS Letters
304: 205, and U.S. Patent No. 5,824,513, each of which is incorporated herein by
reference.

Generally, the loading module is responsible for binding the first building block
used to synthesize the polyketide and transferring it to the first extender module. The
loading module of DEBS consists of an acyltransferase (AT) domain and an acyl carrier
protein (ACP) domain. Another type of loading module utilizes an inactivated
ketosynthase (KS) domain and AT and ACP domains. This inactivated KS is in some
instances called KS^Q, where the superscript letter is the abbreviation for the amino acid,
glutamine, that is present instead of the active site cysteine required for ketosynthase
activity. In other PKS enzymes, including the FK-506 PKS, the loading module
incorporates an unusual starter unit and is composed of a CoA ligase-like activity domain.
In any event, the loading module recognizes a particular acyl-CoA (usually acetyl or
propionyl but sometimes butyryl or other acyl-CoA) and transfers it as a thiol ester to the
ACP of the loading module.

The AT on each of the extender modules recognizes a particular extender-CoA
(malonyl or alpha-substituted malonyl, i.e., methylmalonyl, ethylmalonyl, and 2-
hydroxymalonyl) and transfers it to the ACP of that extender module to form a thioester.
Each extender module is responsible for accepting a compound from a prior module,
binding a building block, attaching the building block to the compound from the prior
module, optionally performing one or more additional functions, and transferring the
resulting compound to the next module.

Each extender module of a modular PKS contains a KS, AT, ACP, and zero, one,
two, or three domains that modify the beta-carbon of the growing polyketide chain. A
typical (non-loading) minimal Type I PKS extender module is exemplified by extender
module three of DEBS, which contains a KS domain, an AT domain, and an ACP
domain. These three domains are sufficient to activate a 2-carbon extender unit and attach
it to the growing polyketide molecule. The next extender module, in turn, is responsible
for attaching the next building block and transferring the growing compound to the next
extender module until synthesis is complete.

Once the PKS is primed with acyl- and malonyl-ACPs, the acyl group of the
loading module is transferred to form a thiol ester (trans-esterification) at the KS of the
first extender module; at this stage, extender module one possesses an acyl-KS and a
malonyl (or substituted malonyl) ACP. The acyl group derived from the loading module
is then covalently attached to the alpha-carbon of the malonyl group to form a carbon-
carbon bond, driven by concomitant decarboxylation, and generating a new acyl-ACP
that has a backbone two carbons longer than the loading building block (elongation or
extension).
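
The two-carbon growth per condensation described above can be tallied with a short sketch. This is illustrative bookkeeping only, not part of the specification: the function name, the starter-unit table, and the example PKS are assumptions, and only backbone carbons are counted (side chains contributed by substituted malonyl extenders are ignored).

```python
# Illustrative sketch: count backbone carbons of a linear polyketide chain.
# STARTER_CARBONS and backbone_carbons are hypothetical names for this example.

STARTER_CARBONS = {"acetyl": 2, "propionyl": 3, "butyryl": 4}  # acyl starter units


def backbone_carbons(starter: str, n_extender_modules: int) -> int:
    """Return the backbone length, assuming each extender module adds two carbons."""
    if n_extender_modules < 0:
        raise ValueError("number of extender modules cannot be negative")
    return STARTER_CARBONS[starter] + 2 * n_extender_modules


if __name__ == "__main__":
    # Hypothetical PKS: a propionyl starter followed by six extender modules.
    print(backbone_carbons("propionyl", 6))  # 3 + 2*6 = 15
```

The count reflects only the decarboxylative condensation step; later reduction, cyclization, or tailoring does not change the number of backbone carbons.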

The polyketide chain, growing by two carbons at each extender module, is
sequentially passed as covalently bound thiol esters from extender module to extender
module, in an assembly line-like process. The carbon chain produced by this process
alone would possess a ketone at every other carbon atom, producing a polyketone, from
which the name polyketide arises. Most commonly, however, additional enzymatic
activities modify the beta keto group of each two carbon unit just after it has been added
to the growing polyketide chain but before it is transferred to the next module.

Thus, in addition to the minimal module containing KS, AT, and ACP domains
necessary to form the carbon-carbon bond, and as noted above, other domains that
modify the beta-carbonyl moiety can be present. Thus, modules may contain a
ketoreductase (KR) domain that reduces the keto group to an alcohol. Modules may also
contain a KR domain plus a dehydratase (DH) domain that dehydrates the alcohol to a
double bond. Modules may also contain a KR domain, a DH domain, and an
enoylreductase (ER) domain that converts the double bond product to a saturated single
bond, leaving the beta carbon as a methylene function. An extender module can also contain
other enzymatic activities, such as, for example, a methylase or dimethylase activity.
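
The relationship between the reductive domains present in a module and the resulting state of the beta-carbon can be summarized as a small lookup. The sketch below is a simplified illustration, not a method of the invention: the function name and string labels are assumptions, and every listed domain is assumed to be catalytically active.

```python
# Illustrative sketch: map the reductive domains of an extender module to the
# oxidation state left at the beta-carbon. Assumes all listed domains are active.

def beta_carbon_state(domains):
    """Return the beta-carbon state for a module given its domain names."""
    reductive = set(domains) & {"KR", "DH", "ER"}
    if not reductive:
        return "ketone"          # minimal module: KS, AT, ACP only
    if reductive == {"KR"}:
        return "hydroxyl"        # keto group reduced to an alcohol
    if reductive == {"KR", "DH"}:
        return "double bond"     # alcohol dehydrated
    if reductive == {"KR", "DH", "ER"}:
        return "methylene"       # double bond reduced to a single bond
    # DH or ER without the upstream domain(s) has no substrate in this model.
    raise ValueError(f"unsupported combination of reductive domains: {sorted(reductive)}")


if __name__ == "__main__":
    print(beta_carbon_state(["KS", "AT", "ACP"]))
    print(beta_carbon_state(["KS", "AT", "DH", "ER", "KR", "ACP"]))
```

A module carrying an inactive domain would need to have that domain omitted from the input for this simple model to apply.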
After traversing the final extender module, the polyketide encounters a releasing
domain that cleaves the polyketide from the PKS and typically cyclizes the polyketide.
For example, final synthesis of 6-deoxyerythronolide B (6-dEB) is regulated by a TE
domain located at the end of extender module six. In the synthesis of 6-dEB, the TE
domain catalyzes cyclization of the macrolide ring by formation of an ester linkage. In
FK-506, FK-520, rapamycin, and similar polyketides, the TE activity is replaced by a
RapP (for rapamycin) or RapP-like activity that makes a linkage incorporating a
pipecolic acid residue. The enzymatic activity that catalyzes this incorporation for the
rapamycin enzyme is known as RapP, encoded by the rapP gene. The polyketide can be
modified further by tailoring enzymes;
these enzymes add carbohydrate groups or methyl groups, or make other modifications, 
i.e., oxidation or reduction, on the polyketide core molecule. For example, 6-dEB is 
hydroxylated at C-6 and C-12 and glycosylated at C-3 and C-5 in the synthesis of 
erythromycin A. 

In Type I PKS polypeptides, the order of catalytic domains is conserved. When all 
beta-keto processing domains are present in a module, the order of domains in that 
module from N-to-C-terminus is always KS, AT, DH, ER, KR, and ACP. Some or all of 
the beta-keto processing domains may be missing in particular modules, but the order of 
the domains present in a module remains the same. The order of domains within modules 
is believed to be important for proper folding of the PKS polypeptides into an active
complex. Importantly, there is considerable flexibility in PKS enzymes, which allows for
the genetic engineering of novel catalytic complexes. The engineering of these enzymes
is achieved by modifying, adding, or deleting domains, or replacing them with those
taken from other Type I PKS enzymes. It is also achieved by deleting entire modules or
by adding or replacing modules using those taken from other sources. A genetically engineered
PKS complex should of course have the ability to catalyze the synthesis of the product 
predicted from the genetic alterations made. 
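
The conserved domain order stated above lends itself to a simple consistency check on a designed module. The following is a minimal sketch under stated assumptions: the list representation of a module and the function name are hypothetical, and the check covers only the N-to-C order of domains, not linkers or folding.

```python
# Illustrative sketch: check that the domains listed for a module appear in the
# conserved N-to-C order KS, AT, DH, ER, KR, ACP, allowing absent domains.

CANONICAL_ORDER = ["KS", "AT", "DH", "ER", "KR", "ACP"]


def follows_canonical_order(module):
    """Return True if the module's domains are in canonical order without repeats."""
    # list.index raises ValueError for a domain name outside the canonical set.
    positions = [CANONICAL_ORDER.index(domain) for domain in module]
    return all(a < b for a, b in zip(positions, positions[1:]))


if __name__ == "__main__":
    print(follows_canonical_order(["KS", "AT", "KR", "ACP"]))        # order preserved
    print(follows_canonical_order(["KS", "AT", "KR", "DH", "ACP"]))  # KR before DH
```

Such a check could serve as a quick sanity test on a planned construct before any cloning work, although it says nothing about whether the engineered enzyme will be active.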

Alignments of the many available amino acid sequences for Type I PKS enzymes
have approximately defined the boundaries of the various catalytic domains. Sequence
alignments also have revealed linker regions between the catalytic domains and at the N-
and C-termini of individual polypeptides. The sequences of these linker regions are less
well conserved than are those for the catalytic domains, which is in part how linker
regions are identified. Linker regions can be important for proper association between
domains and between the individual polypeptides that comprise the PKS complex. One
can thus view the linkers and domains together as creating a scaffold on which the
domains and modules are positioned in the correct orientation to be active. This
organization and positioning, if retained, permits PKS domains of different or identical
substrate specificities to be substituted (usually at the DNA level) between PKS enzymes
by various available methodologies. In selecting the boundaries of, for example, an AT
replacement, one can thus make the replacement so as to retain the linkers of the recipient
PKS or to replace them with the linkers of the donor PKS AT domain, or, preferably,
make both constructs to ensure that the correct linker regions between the KS and AT
domains have been included in at least one of the engineered enzymes. Thus, there is
considerable flexibility in the design of new PKS enzymes with the result that known
polyketides can be produced more effectively, and novel polyketides useful as
pharmaceuticals or for other purposes can be made.

By appropriate application of recombinant DNA technology, a wide variety of
polyketides can be prepared in a variety of different host cells, provided one has access to
nucleic acid compounds that encode PKS proteins and polyketide modification enzymes.
The present invention helps meet the need for such nucleic acid compounds by providing
recombinant vectors that encode the FK-520 PKS enzyme and various FK-520
modification enzymes. Moreover, while the FK-506 and FK-520 polyketides have many
useful activities, there remains a need for compounds with similar useful activities but
with better pharmacokinetic profiles and metabolism and fewer side effects. The present
invention helps meet the need for such compounds as well.


Summary of the Invention 
In one embodiment, the present invention provides recombinant DNA vectors that
encode all or part of the FK-520 PKS enzyme. Illustrative vectors of the invention
include cosmids pKOS034-120, pKOS034-124, pKOS065-C31, pKOS065-C3, pKOS065-
M27, and pKOS065-M21. The invention also provides nucleic acid compounds that
encode the various domains of the FK-520 PKS, i.e., the KS, AT, ACP, KR, DH, and ER
domains. These compounds can be readily used, alone or in combination with nucleic
acids encoding other FK-520 or non-FK-520 PKS domains, as intermediates in the
construction of recombinant vectors that encode all or part of PKS enzymes that make
novel polyketides.

The invention also provides isolated nucleic acids that encode all or part of one or
more modules of the FK-520 PKS, each module comprising a ketosynthase activity, an
acyl transferase activity, and an acyl carrier protein activity. The invention provides an
isolated nucleic acid that encodes one or more open reading frames of FK-520 PKS
genes, said open reading frames comprising coding sequences for a CoA ligase activity,
an NRPS activity, or two or more extender modules. The invention also provides
recombinant expression vectors containing these nucleic acids.
In another embodiment, the invention provides isolated nucleic acids that encode
all or a part of a PKS that contains at least one module in which at least one of the
domains in the module is a domain from a non-FK-520 PKS and at least one domain is
from the FK-520 PKS. The non-FK-520 PKS domain or module originates from the
rapamycin PKS, the FK-506 PKS, DEBS, or another PKS. The invention also provides
recombinant expression vectors containing these nucleic acids.

In another embodiment, the invention provides a method of preparing a
polyketide, said method comprising transforming a host cell with a recombinant DNA
vector that encodes at least one module of a PKS, said module comprising at least one
FK-520 PKS domain, and culturing said host cell under conditions such that said PKS is
produced and catalyzes synthesis of said polyketide. In one aspect, the method is
practiced with a Streptomyces host cell. In another aspect, the polyketide produced is
FK-520. In another aspect, the polyketide produced is a polyketide related in structure to
FK-520. In another aspect, the polyketide produced is a polyketide related in structure to
FK-506 or rapamycin.

In another embodiment, the invention provides a set of genes in recombinant form
sufficient for the synthesis of ethylmalonyl CoA in a heterologous host cell. These genes
and the methods of the invention enable one to create recombinant host cells with the
ability to produce polyketides or other compounds that require ethylmalonyl CoA for
biosynthesis. The invention also provides recombinant nucleic acids that encode AT
domains specific for ethylmalonyl CoA. Thus, the compounds of the invention can be
used to produce polyketides requiring ethylmalonyl CoA in host cells that otherwise are
unable to produce such polyketides.

In another embodiment, the invention provides a set of genes in recombinant form 
sufficient for the synthesis of 2-hydroxymalonyl CoA and 2-methoxymalonyl CoA in a 
heterologous host cell. These genes and the methods of the invention enable one to create 
recombinant host cells with the ability to produce polyketides or other compounds that 
require 2-hydroxymalonyl CoA for biosynthesis. The invention also provides 
recombinant nucleic acids that encode AT domains specific for 2-hydroxymalonyl CoA 
and 2-methoxymalonyl CoA. Thus, the compounds of the invention can be used to 
produce polyketides requiring 2-hydroxymalonyl CoA or 2-methoxymalonyl CoA in host 
cells that are otherwise unable to produce such polyketides. 

In another embodiment, the invention provides a compound related in structure to 
FK-520 or FK-506 that is useful in the treatment of a medical condition. These 
compounds include compounds in which the C-13 methoxy group is replaced by a moiety
selected from the group consisting of hydrogen, methyl, and ethyl moieties. Such 
compounds are less susceptible to the main in vivo pathway of degradation for FK-520 
and FK-506 and related compounds and thus exhibit an improved pharmacokinetic 
profile. The compounds of the invention also include compounds in which the C-15 
methoxy group is replaced by a moiety selected from the group consisting of hydrogen, 
methyl, and ethyl moieties. The compounds of the invention also include the above 
compounds further modified by chemical methodology to produce derivatives such as, 
but not limited to, the C-18 hydroxyl derivatives, which have potent neurotrophic but not
immunosuppressive activity.
wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided
that when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen
or hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and 18-
hydroxy-FK-506. The invention provides these compounds in purified form and in 
pharmaceutical compositions. 

In another embodiment, the invention provides a method for treating a medical 
condition by administering a pharmaceutically efficacious dose of a compound of the 
invention. The compounds of the invention may be administered to achieve 
immunosuppression or to stimulate nerve growth and regeneration. 

These and other embodiments and aspects of the invention will be more fully 
understood after consideration of the attached Drawings and their brief description below, 
together with the detailed description, examples, and claims that follow. 

Brief Description of the Drawings 
Figure 1 shows a diagram of the FK-520 biosynthetic gene cluster. The top line 
provides a scale in kilobase pairs (kb). The second line shows a restriction map with 
selected restriction enzyme recognition sequences indicated. K is KpnI; X is XhoI; S is
SacI; P is PstI; and E is EcoRI. The third line indicates the position of FK-520 PKS and
related genes. Genes are abbreviated with a one letter designation, e.g., C is fkbC.
Immediately under the third line are numbered segments showing where the loading
module (L) and ten different extender modules (numbered 1 - 10) are encoded on the
various genes shown. At the bottom of the Figure, the DNA inserts of various cosmids of
the invention (e.g., 34-124 is cosmid pKOS034-124) are shown in alignment with the FK-
520 biosynthetic gene cluster.

Figure 2 shows the loading module (load), the ten extender modules, and the 
peptide synthetase domain of the FK-520 PKS, together with, on the top line, the genes 
that encode the various domains and modules. Also shown are the various intermediates 
in FK-520 biosynthesis, as well as the structure of FK-520, with carbons 13, 15, 21, and 
31 numbered. The various domains of each module and subdomains of the loading 
module are also shown. The darkened circles showing the DH domains in modules 2, 3, 
and 4 indicate that the dehydratase domain is not functional as a dehydratase; this domain 
may affect the stereochemistry at the corresponding position in the polyketide. The 
substituents on the FK-520 structure that result from the action of non-PKS enzymes are 
also indicated by arrows, together with the types of enzymes or the genes that code for 
the enzymes that mediate the action. Although the methyltransferase is shown acting at 
the C-13 and C-15 hydroxyl groups after release of the polyketide from the PKS, the 
methyltransferase may act on the 2-hydroxymalonyl substrate prior to or 
contemporaneously with its incorporation during polyketide synthesis. 

Figure 3 shows a close-up view of the left end of the FK-520 gene cluster, which 
contains at least ten additional genes. The ethyl side chain on carbon 21 of FK-520 
(Figure 2) is derived from an ethylmalonyl CoA extender unit that is incorporated by an 
ethylmalonyl specific AT domain in extender module 4 of the PKS. At least four of the 
genes in this region code for enzymes involved in ethylmalonyl biosynthesis. The 
polyhydroxybutyrate depolymerase is involved in maintaining hydroxybutyryl-CoA 
pools during FK-520 production. Polyhydroxybutyrate accumulates during vegetative 
growth and disappears during stationary phase in other Streptomyces (Ranade and 
Vining, 1993, Can. J. Microbiol. 39:377). Open reading frames with unknown function 
are indicated with a question mark. 
Figure 4 shows a biosynthetic pathway for the biosynthesis of ethylmalonyl CoA
from acetoacetyl CoA consistent with the function assigned to four of the genes in the
FK-520 gene cluster shown in Figure 3.

Figure 5 shows a close-up view of the right end of the FK-520 PKS gene cluster
(and of the sequences on cosmid pKOS065-C31). The genes shown include fkbD, fkbM
(a methyl transferase that methylates the hydroxyl group on C-31 of FK-520), fkbN (a
homolog of a gene described as a regulator of cholesterol oxidase and that is believed to
be a transcriptional activator), fkbQ (a type II thioesterase, which can increase polyketide
production levels), and fkbS (a crotonyl-CoA reductase involved in the biosynthesis of
ethylmalonyl CoA).

Figure 6 shows the proposed degradative pathway for tacrolimus (FK-506) 
metabolism. 

Figure 7 shows a schematic process for the construction of recombinant PKS 
genes of the invention that encode PKS enzymes that produce 13-desmethoxy FK-506 
and FK-520 polyketides of the invention, as described in Example 4 3 below. 

Figure 8, in Parts A and B, shows certain compounds of the invention preferred 
for dermal application in Part A and a synthetic route for making those compounds in 
Part B.

Detailed Description of the Invention 
Given the valuable pharmaceutical properties of polyketides, there is a need for 
methods and reagents for producing large quantities of polyketides, as well as for 
producing related compounds not found in nature. The present invention provides such 
methods and reagents, with particular application to methods and reagents for producing 
the polyketides known as FK-520, also known as ascomycin or L-683,590 (see Holt et
al., 1993, JACS 115:9925), and FK-506, also known as tacrolimus. Tacrolimus is a
macrolide immunosuppressant used to prevent or treat rejection of transplanted heart, 
kidney, liver, lung, pancreas, and small bowel allografts. The drug is also useful for the 
prevention and treatment of graft-versus-host disease in patients receiving bone marrow 
transplants, and for the treatment of severe, refractory uveitis. There have been additional 
reports of the unapproved use of tacrolimus for other conditions, including alopecia
universalis, autoimmune chronic active hepatitis, inflammatory bowel disease, multiple
sclerosis, primary biliary cirrhosis, and scleroderma. The invention provides methods and
reagents for making novel polyketides related in structure to FK-520 and FK-506, and
structurally related polyketides such as rapamycin.

The FK-506 and rapamycin polyketides are potent immunosuppressants, with 
chemical structures shown below. 




[Chemical structures: FK-506 and rapamycin]

FK-520 differs from FK-506 in that it lacks the allyl group at C-21 of FK-506, having 
instead an ethyl group at that position, and has similar activity to FK-506, albeit reduced 
immunosuppressive activity. 

These compounds act through initial formation of an intermediate complex with 
protein "immunophilins" known as FKBPs (FK-506 binding proteins), including FKBP- 
12. Immunophilins are a class of cytosolic proteins that form complexes with molecules 
such as FK-506, FK-520, and rapamycin that in turn serve as ligands for other cellular 
targets involved in signal transduction. Binding of FK-506, FK-520, and rapamycin to 
FKBP occurs through the structurally similar segments of the polyketide molecules, 
known as the "FKBP-binding domain" (as generally but not precisely indicated by the 
stippled regions in the structures above). The FK-506-FKBP complex then binds 
calcineurin, while the rapamycin-FKBP complex binds to a protein known as RAFT-1.
Binding of the FKBP-polyketide complex to these second proteins occurs through the
dissimilar regions of the drugs known as the "effector" domains.



The three component FKBP-polyketide-effector complex is required for signal 
transduction and subsequent immunosuppressive activity of FK-506, FK-520, and 
rapamycin. Modifications in the effector domains of FK-506, FK-520, and rapamycin 
that destroy binding to the effector proteins (calcineurin or RAFT) lead to loss of 
immunosuppressive activity, even though FKBP binding is unaffected. Further, such 
analogs antagonize the immunosuppressive effects of the parent polyketides, because 
they compete for FKBP. Such non-immunosuppressive analogs also show reduced 
toxicity (see Dumont et al., 1992, Journal of Experimental Medicine 176: 751-760),
indicating that much of the toxicity of these drugs is not linked to FKBP binding. 

In addition to immunosuppressive activity, FK-520, FK-506, and rapamycin have 
neurotrophic activity. In the central nervous system and in peripheral nerves, 
immunophilins are referred to as "neuroimmunophilins". The neuroimmunophilin FKBP 
is markedly enriched in the central nervous system and in peripheral nerves. Molecules 
that bind to the neuroimmunophilin FKBP, such as FK-506 and FK-520, have the 
remarkable effect of stimulating nerve growth. In vitro, they act as neurotrophins, i.e.,
they promote neurite outgrowth in NGF-treated PC12 cells and in sensory neuronal
cultures, and in intact animals, they promote regrowth of damaged facial and sciatic
nerves, and repair lesioned serotonin and dopamine neurons in the brain. See Gold et al.,
Jun. 1999, J. Pharm. Exp. Ther. 259(3): 1202-1210; Lyons et al., 1994, Proc. National
Academy of Science 91: 3191-3195; Gold et al., 1995, Journal of Neuroscience 15: 7509-
7516; and Steiner et al., 1997, Proc. National Academy of Science 94: 2019-2024.
Further, the restored central and peripheral neurons appear to be functional.

dc- 176500
PATENT
AttyDkt: 300622002600

Compared to protein neurotrophic molecules (BDNF, NGF, etc.), the small-
molecule neurotrophins such as FK-506, FK-520, and rapamycin have different, and
often advantageous, properties. First, whereas protein neurotrophins are difficult to
deliver to their intended site of action and may require intra-cranial injection, the small-
molecule neurotrophins display excellent bioavailability; they are active when
administered subcutaneously and orally. Second, whereas protein neurotrophins show
quite specific effects, the small-molecule neurotrophins show rather broad effects.
Finally, whereas protein neurotrophins often show effects on normal sensory nerves, the
small-molecule neurotrophins do not induce aberrant sprouting of normal neuronal
processes and seem to affect damaged nerves specifically. Neuroimmunophilin ligands
have potential therapeutic utility in a variety of disorders involving nerve degeneration
(e.g., multiple sclerosis, Parkinson's disease, Alzheimer's disease, stroke, traumatic spinal
cord and brain injury, peripheral neuropathies).

Recent studies have shown that the immunosuppressive and neurite outgrowth
activity of FK-506, FK-520, and rapamycin can be separated; the neuroregenerative
activity in the absence of immunosuppressive activity is retained by agents which bind to
FKBP but not to the effector proteins calcineurin or RAFT. See Steiner et al., 1997,
Nature Medicine 3: 421-428.




[Figure label: Nerve Regeneration]
Available structure-activity data show that the important features for neurotrophic
activity of rapamycin, FK-520, and FK-506 lie within the common, contiguous segments
of the macrolide ring that bind to FKBP. This portion of the molecule is termed the
"FKBP binding domain" (see Van Duyne et al., 1993, Journal of Molecular Biology 229:
105-124). Nevertheless, the effector domains of the parent macrolides contribute to
conformational rigidity of the binding domain and thus indirectly contribute to FKBP 
binding. 



There are a number of other reported analogs of FK-506, FK-520, and rapamycin that 
bind to FKBP but not the effector protein calcineurin or RAFT. These analogs show 
effects on nerve regeneration without immunosuppressive effects. 

Naturally occurring FK-520 and FK-506 analogs include the antascomycins, 
which are FK-506-like macrolides that lack the functional groups of FK-506 that bind to 
calcineurin (see Fehr et al., 1996, The Journal of Antibiotics 49: 230-233). These 
molecules bind FKBP as effectively as does FK-506; they antagonize the effects of both
FK-506 and rapamycin, yet lack immunosuppressive activity.

[Chemical structure: antascomycin A; figure label: "FKBP binding domain"]

Other analogs can be produced by chemically modifying FK-506, FK-520, or 
rapamycin. One approach to obtaining neuroimmunophilin ligands is to destroy the 
effector binding region of FK-506, FK-520, or rapamycin by chemical modification. 
While the chemical modifications permitted on the parent compounds are quite limited, 
some useful chemically modified analogs exist. The FK-520 analog L-685,818 (ED50 =
0.7 nM for FKBP binding; see Dumont et al., 1992), and the rapamycin analog WAY-
124,466 (IC50 = 12.5 nM; see Ocain et al., 1993, Biochemical and Biophysical Research
Communications 192: 1340-1346) are about as effective as FK-506, FK-520, and
rapamycin at promoting neurite outgrowth in sensory neurons (see Steiner et al., 1997).




[Structures: L-685,818 and WAY-124,466]



One of the few positions of rapamycin that is readily amenable to chemical 
modification is the allylic 16-methoxy group; this reactive group is readily exchanged by 
acid-catalyzed nucleophilic substitution. Replacement of the 16-methoxy group of
rapamycin with a variety of bulky groups has produced analogs showing selective loss of
immunosuppressive activity while retaining FKBP binding (see Luengo et al., 1995,
Chemistry & Biology 2: 471-481). One of the best compounds, 1, below, shows complete 
loss of activity in the splenocyte proliferation assay with only a 10-fold reduction in 
binding to FKBP. 




There are also synthetic analogs of FKBP binding domains. These compounds 
reflect an approach to obtaining neuroimmunophilin ligands based on "rationally 
designed" molecules that retain the FKBP-binding region in an appropriate conformation 
for binding to FKBP, but do not possess the effector binding regions. In one example, the 
ends of the FKBP binding domain were tethered by hydrocarbon chains (see Holt et al., 
1993, Journal of the American Chemical Society 115: 9925-9938); the best analog, 2, 
below, binds to FKBP about as well as FK-506. In a similar approach, the ends of the
FKBP binding domain were tethered by a tripeptide to give analog 3, below, which binds
to FKBP about 20-fold more weakly than FK-506. These compounds are anticipated to have
neuroimmunophilin binding activity.

[Structures: compounds 1, 2, and 3]



In a primate MPTP model of Parkinson's disease, administration of FKBP ligand 
GPI-1046 caused brain cells to regenerate and behavioral measures to improve. MPTP is 
a neurotoxin, which, when administered to animals, selectively damages nigral-striatal 
dopamine neurons in the brain, mimicking the damage caused by Parkinson's disease. 
Whereas, before treatment, animals were unable to use affected limbs, the FKBP ligand 
restored the ability of animals to feed themselves and gave improvements in measures of 
locomotor activity, neurological outcome, and fine motor control. There were also 
corresponding increases in regrowth of damaged nerve terminals. These results 
demonstrate the utility of FKBP ligands for treatment of diseases of the CNS. 

From the above description, two general approaches towards the design of non- 
immunosuppressant, neuroimmunophilin ligands can be seen. The first involves the 
construction of constrained cyclic analogs of FK-506 in which the FKBP binding domain 
is fixed in a conformation optimal for binding to FKBP. The advantages of this approach 
are that the conformation of the analogs can be accurately modeled and predicted by 
computational methods, and the analogs closely resemble parent molecules that have 
proven pharmacological properties. A disadvantage is that the difficult chemistry limits 
the numbers and types of compounds that can be prepared. The second approach involves 
the trial and error construction of acyclic analogs of the FKBP binding domain by 
conventional medicinal chemistry. The advantage of this approach is that the chemistry
is suitable for production of the numerous compounds needed for such iterative
chemistry-bioassay approaches. The disadvantages are that the molecular types of
compounds that have emerged have no known history of appropriate pharmacological
properties, have rather labile ester functional groups, and are too conformationally mobile
to allow accurate prediction of conformational properties.

The present invention provides useful methods and reagents related to the first 
approach, but with significant advantages. The invention provides recombinant PKS 
genes that produce a wide variety of polyketides that cannot otherwise be readily 
synthesized by chemical methodology alone. Moreover, the present invention provides 
polyketides that have either or both of the desired immunosuppressive and neurotrophic 
activities, some of which are produced only by fermentation and others of which are 
produced by fermentation and chemical modification. Thus, in one aspect, the invention 
provides compounds that optimally bind to FKBP but do not bind to the effector proteins. 
The methods and reagents of the invention can be used to prepare numerous constrained 
cyclic analogs of FK-520 in which the FKBP binding domain is fixed in a conformation 
optimal for binding to FKBP. Such compounds will show neuroimmunophilin binding 
(neurotrophic) but not immunosuppressive effects. The invention also allows direct 
manipulation of FK-520 and related chemical structures via genetic engineering of the 
enzymes involved in the biosynthesis of FK-520 (as well as related compounds, such as 
FK-506 and rapamycin); similar chemical modifications are simply not possible because 
of the complexity of the structures. The invention can also be used to introduce "chemical 
handles" into normally inert positions that permit subsequent chemical modifications. 

Several general approaches to achieve the development of novel 
neuroimmunophilin ligands are facilitated by the methods and reagents of the present 
invention. One approach is to make "point mutations" of the functional groups of the 
parent FK-520 structure that bind to the effector molecules to eliminate their binding 
potential. These types of structural modifications are difficult to perform by chemical 
modification, but can be readily accomplished with the methods and reagents of the 
invention. 

A second, more extensive approach facilitated by the present invention is to 
utilize molecular modeling to predict optimal structures ab initio that bind to FKBP but 
not effector molecules. Using the available X-ray crystal structure of FK-520 (or FK-506) 
bound to FKBP, molecular modeling can be used to predict polyketides that should 
optimally bind to FKBP but not calcineurin. Various macrolide structures can be
generated by linking the ends of the FKBP-binding domain with "all possible" polyketide 
chains of variable length and substitution patterns that can be prepared by genetic 
manipulation of the FK-520 or FK-506 PKS gene cluster in accordance with the methods 
of the invention. The ground state conformations of the virtual library can be determined, 
and compounds that possess binding domains most likely to bind well to FKBP can be 
prepared and tested. 
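
By way of illustration only, the enumeration of such a virtual library can be sketched in Python as follows. The substituent set and the beta-carbon processing states listed here are hypothetical placeholders chosen for the example; they are not a statement of which extender units or reductive states any particular PKS module can actually provide. The sketch only shows how quickly the number of candidate linker chains grows with chain length.

```python
from itertools import product

# Hypothetical side-chain choices at each extender position (illustrative only).
SUBSTITUENTS = ("H", "methyl", "ethyl", "methoxy")
# Hypothetical states left at the beta-carbon after each extension (illustrative only).
BETA_STATES = ("keto", "hydroxyl", "alkene", "methylene")


def enumerate_linkers(min_units, max_units):
    """Yield every linker as a tuple of (substituent, beta_state) pairs."""
    unit_choices = list(product(SUBSTITUENTS, BETA_STATES))
    for length in range(min_units, max_units + 1):
        yield from product(unit_choices, repeat=length)


def library_size(min_units, max_units):
    """Count the linkers without materializing them."""
    per_unit = len(SUBSTITUENTS) * len(BETA_STATES)
    return sum(per_unit ** n for n in range(min_units, max_units + 1))


if __name__ == "__main__":
    print(library_size(1, 2), "linkers of one or two extender units")
```

In practice each enumerated chain would be passed to a conformational search, and only those predicted to hold the binding domain in a favorable conformation would be selected for preparation.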

Once a compound is identified in accordance with the above approaches, the 
invention can be used to generate a focused library of analogs around the lead candidate, 
to "fine tune" the compound for optimal properties. Finally, the genetic engineering 
methods of the invention can be directed towards producing "chemical handles" that 
enable medicinal chemists to modify positions of the molecule previously inert to 
chemical modification. This opens the path to previously prohibited chemical 
optimization of lead compounds by time-proven approaches. 

Moreover, the present invention provides polyketide compounds and the 
recombinant genes for the PKS enzymes that produce the compounds that have 
significant advantages over FK-506 and FK-520 and their analogs. The metabolism and 
pharmacokinetics of tacrolimus have been extensively studied, and FK-520 is believed to
be similar in these respects. Absorption of tacrolimus is rapid, variable, and incomplete
from the gastrointestinal tract (Harrison's Principles of Internal Medicine, 14th edition,
1998, McGraw Hill, 14, 20, 21, 64-67). The mean bioavailability of the oral dosage form
is 27% (range, 5 to 65%). The volume of distribution (VolD) based on plasma is 5 to 65
L per kg of body weight (L/kg), and is much higher than the VolD based on whole blood
concentrations, the difference reflecting the binding of tacrolimus to red blood cells.
Whole blood concentrations may be 12 to 67 times the plasma concentrations. Protein
binding is high (75 to 99%), primarily to albumin and alpha1-acid glycoprotein. The
half-life for distribution is 0.9 hour; elimination is biphasic and variable, with a terminal
half-life of 11.3 hours (range, 3.5 to 40.5 hours). The time to peak concentration is 0.5 to 4 hours after oral
administration. 
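
To make the significance of the half-life range concrete, the following Python sketch converts a half-life into a first-order elimination rate constant and a fraction of drug remaining. It is a simplification: it assumes a single first-order terminal phase and ignores the distribution phase, whereas elimination is described above as biphasic. The terminal half-life is taken as 11.3 hours with a range of 3.5 to 40.5 hours; the function names are illustrative.

```python
import math

TERMINAL_HALF_LIFE_H = 11.3  # terminal half-life in hours, as read from the text


def elimination_rate_constant(half_life_h):
    """First-order rate constant k (per hour) from a half-life: k = ln(2) / t_half."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h, half_life_h=TERMINAL_HALF_LIFE_H):
    """Fraction of drug remaining after t_h hours under first-order elimination."""
    return math.exp(-elimination_rate_constant(half_life_h) * t_h)


if __name__ == "__main__":
    # Compare the two ends of the reported half-life range at 12 hours post-dose.
    for half_life in (3.5, 11.3, 40.5):
        print(half_life, round(fraction_remaining(12.0, half_life), 3))
```

Under these assumptions, the fraction remaining 12 hours after a dose differs many-fold between the two ends of the reported range, which is one way to see why dosing of the parent compound is difficult to predict.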
Tacrolimus is metabolized primarily by cytochrome P450 3A enzymes in the liver
and small intestine. The drug is extensively metabolized, with less than 1% excreted
unchanged in urine. Because hepatic dysfunction decreases clearance of tacrolimus, doses
have to be reduced substantially in primary graft non-function, especially in children. In
addition, drugs that induce the cytochrome P450 3A enzymes reduce tacrolimus levels,
while drugs that inhibit these P450s increase tacrolimus levels. Tacrolimus bioavailability
doubles with co-administration of ketoconazole, a drug that inhibits P450 3A. See
Vincent et al., 1992, In vitro metabolism of FK-506 in rat, rabbit, and human liver
microsomes: Identification of a major metabolite and of cytochrome P450 3A as the
major enzymes responsible for its metabolism, Arch. Biochem. Biophys. 294: 454-460;
Iwasaki et al., 1993, Isolation, identification, and biological activities of oxidative
metabolites of FK-506, a potent immunosuppressive macrolide lactone, Drug Metabolism
& Disposition 21: 971-977; Shiraga et al., 1994, Metabolism of FK-506, a potent
immunosuppressive agent, by cytochrome P450 3A enzymes in rat, dog, and human liver
microsomes, Biochem. Pharmacol. 47: 727-735; and Iwasaki et al., 1995, Further
metabolism of FK-506 (Tacrolimus); Identification and biological activities of the
metabolites oxidized at multiple sites of FK-506, Drug Metabolism & Disposition 23:
28-34. The cytochrome P450 3A subfamily of isozymes has been implicated as important in
this degradative process.

Structures of the eight isolated metabolites formed by liver microsomes are shown 
in Figure 6. Four metabolites of FK-506 involve demethylation of the oxygens on
carbons 13, 15, and 31, and hydroxylation of carbon 12. The 13-demethylated (hydroxy)
compounds undergo cyclizations of the 13-hydroxy at C-10 to give M-I, M-VI, and M-VII,
and the 12-hydroxy metabolite at C-10 to give I. Another four metabolites formed by
oxidation of the four metabolites mentioned above were isolated using liver microsomes
from dexamethasone-treated rats. Three of these are metabolites doubly demethylated at
the methoxy groups on carbons 15 and 31 (M-V), 13 and 31 (M-VI), and 13 and 15
(M-VII). The fourth, M-VIII, was the metabolite produced after demethylation of the
31-methoxy group, followed by formation of a fused ring system by further oxidation.
Among the eight metabolites, M-II has immunosuppressive activity comparable to that of
FK-506, whereas the other metabolites exhibit weak or negligible activities. Importantly,
the major metabolite of human, dog, and rat liver microsomes is the 13-demethylated and 
cyclized FK-506 (M-I). 

Thus, the major metabolism of FK-506 proceeds via 13-demethylation followed 
by cyclization to the inactive M-I, this representing about 90% of the metabolic products 
after a 10 minute incubation with liver microsomes. Analogs of tacrolimus that do not 
possess a C-13 methoxy group would not be susceptible to the first and most important 
biotransformation in the destructive metabolism of tacrolimus (i.e., cyclization of
13-hydroxy to C-10). Thus, a 13-desmethoxy analog of FK-506 should have a longer
half-life in the body than does FK-506. The C-13 methoxy group is believed not to be
required for binding to FKBP or calcineurin. The C-13 methoxy is not present at the
corresponding position of rapamycin, which binds to FKBP with the same affinity as
tacrolimus. Also, analysis of the three-dimensional structure of the
FKBP-tacrolimus-calcineurin complex shows that the C-13 methoxy has no interaction with FKBP and only
a minor interaction with calcineurin. The present invention provides C-13-desmethoxy
analogs of FK-506 and FK-520, as well as the recombinant genes that encode the PKS 
enzymes that catalyze their synthesis and host cells that produce the compounds. 

These compounds exhibit, relative to their naturally occurring counterparts, 
prolonged immunosuppressive action in vivo, thereby allowing a lower dosage and/or 
reduced frequency of administration. Dosing is more predictable, because the variability 
in FK-506 dosage is largely due to variation of metabolism rate. FK-506 levels in blood 
can vary widely depending on interactions with drugs that induce or inhibit cytochrome 
P450 3A (summarized in USP Drug Information for the Health Care Professional). Of 
particular importance are the numerous drugs that inhibit or compete for CYP 3A, 
because they increase FK-506 blood levels and lead to toxicity (Prograf package insert, 
Fujisawa US, Rev 4/97, Rec 6/97). Also important are the drugs that induce P450 3A
(e.g., dexamethasone), because they decrease FK-506 blood levels and reduce efficacy.
Because the major site of CYP 3A action on FK-506 is removed in the analogs provided
by the present invention, those analogs are not as susceptible to drug interactions as the
naturally occurring compounds. 
Hyperglycemia, nephrotoxicity, and neurotoxicity are the most significant adverse
effects resulting from the use of FK-506 and are believed to be similar for FK-520.
Because these effects appear to occur primarily by the same mechanism as the
immunosuppressive action (i.e., FKBP-calcineurin interaction), the intrinsic toxicity of the
desmethoxy analogs may be similar to that of FK-506. However, toxicity of FK-506 is dose
related and correlates with high blood levels of the drug (Prograf package insert,
Fujisawa US, Rev 4/97, Rec 6/97). Because the levels of the compounds provided by
the present invention should be more controllable, the incidence of toxicity should be
significantly decreased with the 13-desmethoxy analogs. Some reports show that certain
FK-506 metabolites are more toxic than FK-506 itself, and this provides an additional
reason to expect that a CYP 3A-resistant analog can have lower toxicity and a higher
therapeutic index. 

Thus, the present invention provides novel compounds related in structure to FK- 
506 and FK-520 but with improved properties. The invention also provides methods for 
making these compounds by fermentation of recombinant host cells, as well as the 
recombinant host cells, the recombinant vectors in those host cells, and the recombinant 
proteins encoded by those vectors. The present invention also provides other valuable 
materials useful in the construction of these recombinant vectors that have many other 
important applications as well. In particular, the present invention provides the FK-520 
PKS genes, as well as certain genes involved in the biosynthesis of FK-520 in 
recombinant form. 

FK-520 is produced at relatively low levels in the naturally occurring cells, 
Streptomyces hygroscopicus var. ascomyceticus, in which it was first identified. Thus, 
another benefit provided by the recombinant FK-520 PKS and related genes of the 
present invention is the ability to produce FK-520 in greater quantities in the recombinant 
host cells provided by the invention. The invention also provides methods for making 
novel FK-520 analogs, in addition to the desmethoxy analogs described above, and 
derivatives in recombinant host cells of any origin. 

The biosynthesis of FK-520 involves the action of several enzymes. The FK-520 
PKS enzyme, which is composed of the fkbA, fkbB, fkbC, and fkbP gene products,
synthesizes the core structure of the molecule. There is also a hydroxylation at C-9
mediated by the P450 hydroxylase that is the fkbD gene product; the resulting hydroxyl
is oxidized by the fkbO gene product to form a keto group at C-9. There is also a
methylation at C-31 that is mediated by an O-methyltransferase that is the fkbM gene
product. There are also methylations at the C-13 and C-15 positions by a
methyltransferase believed to be encoded by the fkbG gene; this methyltransferase may 
act on the hydroxymalonyl CoA substrates prior to binding of the substrate to the AT 
domains of the PKS during polyketide synthesis. The present invention provides the 
genes encoding these enzymes in recombinant form. The invention also provides the 
genes encoding the enzymes involved in ethylmalonyl CoA and 2-hydroxymalonyl CoA 
biosynthesis in recombinant form. Moreover, the invention provides Streptomyces 
hygroscopicus var. ascomyceticus recombinant host cells lacking one or more of these 
genes that are useful in the production of useful compounds. 

The cells are useful in production in a variety of ways. First, certain cells make a 
useful FK-520-related compound merely as a result of inactivation of one or more of the 
FK-520 biosynthesis genes. Thus, by inactivating the C-31 O-methyltransferase gene in 
Streptomyces hygroscopicus var. ascomyceticus, one creates a host cell that makes a 
desmethyl (at C-31) derivative of FK-520. Second, other cells of the invention are unable 
to make FK-520 or FK-520 related compounds due to an inactivation of one or more of 
the PKS genes. These cells are useful in the production of other polyketides produced by 
PKS enzymes that are encoded on recombinant expression vectors and introduced into 
the host cell. 

Moreover, if only one PKS gene is inactivated, the ability to produce FK-520 or 
an FK-520 derivative compound is restored by introduction of a recombinant expression 
vector that contains the functional gene in a modified or unmodified form. The 
introduced gene produces a gene product that, together with the other endogenous and 
functional gene products, produces the desired compound. This methodology enables one 
to produce FK-520 derivative compounds without requiring that all of the genes for the 
PKS enzyme be present on one or more expression vectors. Additional applications and 
benefits of such cells and methodology will be readily apparent to those of skill in the art 
after consideration of how the recombinant genes were isolated and employed in the
construction of the compounds of the invention. 

The FK-520 biosynthetic genes were isolated by the following procedure. 
Genomic DNA was isolated from Streptomyces hygroscopicus var. ascomyceticus
(ATCC 14891) using the lysozyme/proteinase K protocol described in Genetic
Manipulation of Streptomyces - A Laboratory Manual (Hopwood et al., 1986). The
average size of the DNA was estimated to be between 80 and 120 kb by electrophoresis on
0.3% agarose gels. A library was constructed in the SuperCos™ vector according to the
manufacturer's instructions and with the reagents provided in the commercially available
kit (Stratagene). Briefly, 100 µg of genomic DNA was partially digested with 4 units of
Sau3AI for 20 min in a reaction volume of 1 mL, and the fragments were
dephosphorylated and ligated to SuperCos vector arms. The ligated DNA was packaged
and used to infect log-stage XL1-Blue MR cells. A library of about 10,000 independent
cosmid clones was obtained. 
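
Whether a library of this size is likely to contain any given region of the genome can be estimated with the Clarke and Carbon relationship, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size and N is the number of independent clones. The Python sketch below applies it. The insert size (40 kb) and genome size (8,000 kb) used in the example are assumptions made for illustration; neither figure is stated in this description.

```python
import math


def probability_represented(n_clones, insert_kb, genome_kb):
    """Probability that a given locus is present in a random library of n_clones."""
    f = insert_kb / genome_kb
    return 1.0 - (1.0 - f) ** n_clones


def clones_needed(p, insert_kb, genome_kb):
    """Smallest number of clones giving probability p of containing a given locus."""
    f = insert_kb / genome_kb
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - f))


if __name__ == "__main__":
    ASSUMED_INSERT_KB = 40      # assumption: typical cosmid insert, not from the text
    ASSUMED_GENOME_KB = 8000    # assumption: illustrative genome size, not from the text
    print(probability_represented(10_000, ASSUMED_INSERT_KB, ASSUMED_GENOME_KB))
    print(clones_needed(0.99, ASSUMED_INSERT_KB, ASSUMED_GENOME_KB))
```

Under these assumed sizes, a library of about 10,000 clones exceeds by a wide margin the number needed for 99% confidence, which is consistent with the recovery of overlapping cosmids described below.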

Based on recently published sequence from the FK-506 cluster (Motamedi and 
Shafiee, 1998, Eur. J. Biochem. 256: 528), a probe for the fkbO gene was isolated from 
ATCC 14891 using PCR with degenerate primers. With this probe, a cosmid designated 
pKOS034-124 was isolated from the library. With probes made from the ends of cosmid 
pKOS034-124, an additional cosmid designated pKOS034-120 was isolated. These 
cosmids (pKOS034-124 and pKOS034-120) were shown to contain DNA inserts that 
overlap with one another. Initial sequence data from these two cosmids generated 
sequences similar to sequences from the FK-506 and rapamycin clusters, indicating that 
the inserts were from the FK-520 PKS gene cluster. Two EcoRI fragments were
subcloned from cosmids pKOS034-124 and pKOS034-120. These subclones were used
to prepare shotgun libraries by partial digestion with Sau3AI, gel purification of
fragments between 1.5 kb and 3 kb in size, and ligation into the pLitmus28 vector (New
England Biolabs). These libraries were sequenced using dye terminators on a Beckman
CEQ2000 capillary electrophoresis sequencer, according to the manufacturer's protocols.
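
The degenerate primers mentioned above for isolating the fkbO probe are mixtures of related oligonucleotides written with IUPAC ambiguity codes. The following Python sketch shows how such a primer expands into its concrete sequences and how its degeneracy is counted. The example primer is hypothetical; the primers actually used are not given in this description.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    hypothetical_primer = "GGNTAYR"  # illustrative only, not a primer from the text
    print(degeneracy(hypothetical_primer))
    print(expand(hypothetical_primer))
```

Keeping the degeneracy low, for example by restricting ambiguity codes to codon third positions, is a common way to limit non-specific amplification.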

To obtain cosmids containing sequence on the left and right sides of the 
sequenced region described above, a new cosmid library of ATCC 14891 DNA was 
prepared essentially as described above. This new library was screened with a new fkbM
probe isolated using DNA from ATCC 14891. A probe representing the fkbP gene at the
end of cosmid pKOS034-124 was also used. Several additional cosmids to the right of the 
previously sequenced region were identified. Cosmids pKOS065-C31 and pKOS065-C3 
5 were identified and then mapped with restriction enzymes. Initial sequences from these 
cosmids were consistent with the expected organization of the cluster in this region. More 
extensive sequencing showed that both cosmids contained in addition to the desired 
sequences, other sequences not contiguous to the desired sequences on the host cell 
chromosomal DNA. Probing of additional cosmid libraries identified two additional
cosmids, pKOS065-M27 and pKOS065-M21, that contained the desired sequences in a
contiguous segment of chromosomal DNA. Cosmids pKOS034-124, pKOS034-120,
pKOS065-M27, and pKOS065-M21 have been deposited with the American Type
Culture Collection, Manassas, VA, USA. The complete nucleotide sequence of the
coding sequences of the genes that encode the proteins of the FK-520 PKS is shown
below but can also be determined from the cosmids of the invention deposited with the
ATCC using standard methodology.

Referring to Figures 1 and 3, the FK-520 PKS gene cluster is composed of four 
open reading frames designated fkbB, fkbC, fkbA, and fkbP. The fkbB open reading frame
encodes the loading module and the first four extender modules of the PKS. The fkbC
open reading frame encodes extender modules five and six of the PKS. The fkbA open
reading frame encodes extender modules seven, eight, nine, and ten of the PKS. The fkbP
open reading frame encodes the NRPS of the PKS. Each of these genes can be isolated
from the cosmids of the invention described above. The DNA sequences of these genes
are provided below preceded by the following table identifying the start and stop codons
of the open reading frames of each gene and the modules and domains contained therein.

Nucleotides                      Gene or Domain

complement (412 - 1836)          fkbW
complement (2020 - 3579)         fkbV
complement (3969 - 4496)         fkbR2
complement (4595 - 5488)         fkbR1
5601 - 6818                      fkbE
6808 - 8052
8156 - 8824
complement (9122 - 9883)
complement (9894 - 10994)
complement (10987 - 11247)
complement (11244 - 12092)
complement (12113 - 13150)
complement (13212 - 23988)
complement (23992 - 46573)
46754 - 47788
47785 - 52272
52275 - 71465
71462 - 72628
72625 - 73407                    fkbM
complement (73460 - 76202)       fkbN
complement (76336 - 77080)       fkbQ
complement (77076 - 77535)       fkbS
complement (44974 - 46573)       CoA ligase of loading domain
complement (43777 - 44629)       ER of loading domain
complement (43144 - 43660)       ACP of loading domain
complement (41842 - 43093)       KS of extender module 1 (KS1)
complement (40609 - 41842)       AT1
complement (39442 - 40609)       DH1
complement (38677 - 39307)       KR1
complement (38371 - 38581)       ACP1
complement (37145 - 38296)       KS2
complement (35749 - 37144)       AT2
complement (34606 - 35749)       DH2 (inactive)
complement (33823 - 34480)       KR2
complement (33505 - 33715)       ACP2
complement (32185 - 33439)       KS3
complement (31018 - 32185)       AT3
complement (29869 - 31018)       DH3 (inactive)
complement (29092 - 29740)       KR3
complement (28750 - 28960)       ACP3
complement (27430 - 28684)       KS4
complement (26146 - 27430)       AT4
complement (24997 - 26146)       DH4 (inactive)
complement (24163 - 24373)       ACP4
complement (22653 - 23892)       KS5
complement (21420 - 22653)       AT5
complement (20241 - 21420)       DH5
complement (19464 - 20097)       KR5
complement (19116 - 19326)       ACP5
complement (17820 - 19053)
complement (16587 - 17820)
complement (15438 - 16587)
complement (14517 - 15294)
complement (13761 - 14394)
complement (13452 - 13662)
52362 - 53576
53577 - 54716
54717 - 55871
56019 - 56819
7S694J - 57575
^477J10757929/
57990 - 59243
59244 - 60398
60399 - 61412
61548 - 62180
62328 - 62537
62598 - 63854
63855 - 65084
65085 - 66254
66399 - 67175
67299 - 67931
68094 - 68303
68397 - 69653
69654 - 70985                    AT10
71064 - 71273                    ACP10
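
For readers working with the sequence electronically, the Python sketch below shows one way to turn an entry of the table above into the corresponding stretch of sequence. It assumes that coordinates are 1-based and inclusive and that "complement" denotes the reverse complement of the listed strand, as in the usual sequence database feature-table convention; the description does not state this explicitly, and the function names are illustrative. The toy sequence in the example is not taken from the listing.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
# Accepts "5601 - 6818", "complement (412 - 1836)" and "complement(40609 - 41842)".
_LOCATION = re.compile(r"^\s*(complement)?\s*\(?\s*(\d+)\s*-\s*(\d+)\s*\)?\s*$")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_location(text):
    """Parse a table entry into (start, end, is_complement)."""
    match = _LOCATION.match(text)
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    return int(match.group(2)), int(match.group(3)), match.group(1) is not None


def extract(seq, location):
    """Return the coding-strand sequence for a location (assumed 1-based, inclusive)."""
    start, end, is_complement = parse_location(location)
    fragment = seq[start - 1:end]
    return reverse_complement(fragment) if is_complement else fragment


if __name__ == "__main__":
    toy = "AAACCCGGGTTT"  # toy sequence for demonstration only
    print(extract(toy, "4 - 6"))
    print(extract(toy, "complement (1 - 6)"))
```

One consistency check on the inclusive reading is that the first entry, complement (412 - 1836), then spans 1,425 nucleotides, a whole number of codons.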

1 GATCTCAGGC ATGAAGTCCT CCAGGCGAGG CGCCGAGGTG GTGAACACCT CGCCGCTGCT 
61 TGTACGGACC ACTTCAGTCA GCGGCGATTG CGGAACCAAG TCATCCGGAA TAAAGGGCGG 
121 TTACAAGATC CTCACATTGC GCGACCGCCA GCATACGCTG AGTTGCCTCA GAGGCAAACC 
181 GAAAGGGCGC GGGCGGTCCG CACCAGGGCG GAGTACGCGA CGAGAGTGGC GCACCCGCGC 
241 ACCGTCACCT CTCTCCCCCG CCGGCGGGAT GCCCGGCGTG ACACGGTTGG GCTCTCCTCG 
301 ACGCTGAACA CCCGCGCGGT GTGGCGTCGG GGACACCGCC TGGCATCGGC CGGGTGACGG 
361 TACGGGGAGG GCGTACGGCG GCCGTGGCTC GTGCTCACGG CCGCCGGGCG GTCATCCGTC 
421 GAGACGGCAC TCGGCGAGCA GGGACGCCTG GTCGGCACCT GCGGGCCGGA CGACCGTGTG 
481 GTTCGCGGGC GGGCGGTGGC CGGTGGTGAG CCAGCTCTCC AGGGCGGTGA AGGCTGAGCG 
541 GTGACACGGC AGCAAAGGCC GGAGTCGGTC GGGGAAGGTG TCGACGAGGG CGTCGGTGTG 
601 CGTGCCGTCC TCGATGCGGT AGTAGCGGTA CCGGCCGCCA GGCCGCTGCC GGACATACGC 
661 GCGTACACGT CGGAGCCCGG GCGGCAGGCA GCAGCACGTC GAGAGTGCCT GGATGGTGAT 
721 CAGCGGCTTG CCGATACGAC CGGTCAACGC GATGCGTTCC ACGGCCGCGT GGACGCCGGA 
781 GGAGCGGGTG GCGTAGTCGT AGTCGGCATC GCAGCCCGGG ACCGTCCCCG GGGCGCAATA 
841 CGGTGTGCCG GCTTCCTTCT CCCCATCGAA GCCGGGGTCG AACTCCTCGC GGTAGACGCG 
901 CTGCGTCAGA TCCCAGTAGA CCTCGTGGTG GTACGGCCAC AAGAACTCGG AGTCGGCCGG 
961 GAACCCGGCG CGGAGCAGCG CCTCGCGCGC CTGGCCGGCT GCGGGGCCGC CTGCCGCGTA 
1021 GGTGGGGTAG TCGCGCAGGG CGGCCGGCAG GAAGGTGAAG AGGTTGGGAC CCTCCGCGCG 
1081 CCACAGGGTG CCTTCCCAGT CGACTCCTCC GTCGTACAGC TCGGGATGGT TCTCCAGCTG 
1141 CCAGCGCACG AGGTAGCCGC CGTTGGACAT CCCGGTGACC AGGGTGCGCT CGAGCGGCCG 
1201 GTGGTAGCGC TGGGCGACCG ACGCGCGGGC GGCCCGGGTC AGCTGGGTGA GGCGGGTGTT 



1261 CCACTCGGCG ACGGCGTCGC CCGGCCGGGA GCCATCACGG TAGAACGCGG GGCCGGTGTT
1321 GCCCTTGTCG GTGGCGGCGT AGGCGTAACC GCGGGCGAGC ACCCAGTCGG CGATGGCCCG
1381 GTCGTTGGCG TACTGCTCGC GGTTACCGGG GGTGCCGGCC ACGACCAGGC CACCGTTCCA
1441 GCGGTCGGGC AGCCGGATGA CGAACTGGGC GTCGTGGTTC CACCCGTGGT TGGTGTTGGT
1501 GGTGGAGGTG TCGGGGAAGT AGCCGTCGAT CTGGATCCCG GGCACTCCGG TGGGAGTGGC
1561 CAGGTTCTTG GGCGTCAGCC CTGCCCAGTC CGCCGGGTCG GTGTGGCCGG TGGCCGCCGT
1621 TCCCGCCGTG GTCAGCTCGT CCAGGCAGTC GGCCTGCTGA CGTGCCGCCG CCGGGACACG
1681 CAGCTGGGAC AGACGGGCGC AGTGACCGTC CGGGGCATCG GGAGCAGGCC GGGCCGTGGC
1741 CGGTGAGGGG AGCAGGACGG CGACTGCGGC CAGGGTGAGA GCGCCGAGGC CGGTGCGTCT
1801 TCTCGGGGCC CGTCCGACAC CGAGGGGCAG AACCATGGAG AGCCTCCAGA CGTGCGGATG
1861 GATGACGGAC TGGAGGCTAG GTCGCGCACG GTGGAGACGA ACATGGGTGC GCCCGCCATG
1921 ACTGAGGCCC CTCAGAGGTG GGCCGCCGCC ATGACGGGCG CGGGACCGCG GGCGCTCCGG
1981 GGCGGTGCCC GCGGCCGCCA CCGGTTCCGG GTCCCCGGGT CAGGGACAGG TGTCGTTCGC
2041 GACGGTGAAG TAGCCGGTCG GCGACTCTTT CAAGGTGGTC GTGACGAAGG TGTTGTACAG
2101 GCCCATGTTC TGGCCGGAGC CCTTGGCGTA GGTGTAACCG GCGCTCGTCG TGGCGCGGCC
2161 CGCCTGGACG TGAGCGTAGT TGCCGGCGGT CCAGCAGACG GCCGTGGCAC CGGTCGTCTG
2221 CGCGGTGACC GCGCCCGAGA GCGGTCCGGC CTTGCCGTCC GCGTCCCGGG CGGCGACCGC
2281 GTAGGTGTGC GATGTGCCCG CCCTCAGGCC GGTGTCCGTG TACGACGTCG TGGCGGACGT
2341 GGTGATCTGG GCACCGTCGC GGTGGACGGC GTAGTCGGTG GCGCCGTCGA CGGGTTTCCA
2401 GGTCAGGCTG ATGGTGGTGT CGGTGGCGCC GGTGGCGGCC AGGCCGGACG GAGCGGGCAG
2461 CGAACCGGGG TCGGAGGCGG ATCCGCTCAG GCCGAAGAAC TGCGTGATCC AGTAGCTGGA
2521 ACAGATCGAG TCCAGGAAGT AGGCGGCGCC GGTGCTGCCG CACTGCTGTG CTCCGGTGCC
2581 GGGATCGACC GGGGTGCCGT GCCCGATGCC CGGCACCCGG TTCACCTCCA CGGCCACCGA
2641 TCCGTCCGCG GCCAGGTACT CCTCGTGCCG GGTGGAGTTC GGGCCGATCA CCGAGGTACG
2701 GTCCGGCGTC TGGGACACGC CGTGCACAGC GGTCCACTGG TCGCGCAACT CGTCGGCGTT
2761 GCGCGGCGCG ACGGTGGTGT CCTTGTCGCC GTGCCAGATG GCCACGCGCG GCCACGGGCC
2821 CGACCACGAG GGGTAGCCGT CACGGACCCG CCGCGCCCAC TGGTCCGCGG TCAGGTCGGT
2881 CCCGGGGTTC ATGCACAGGT ACGCGCTGCT GACGTCGGTG GCACAGCCGA AGGGCAGGCC
2941 GGCGACGACC GCGCCGGCCT GGAAGACGTC CGGATAGGTG GCGAGCATCA CCGACGTCAT
3001 GGCACCGCCG GCGGACAGCC CGGTGATGTA GGTGCGCTGG GGGTCCGCGC CGTAGGCGGA
3061 GACGGTGTGA GCGGCCATCT GCCGGATCGA CGCGGCTTCG CCCTGGCCCC TGCGGTTGTC
3121 GCTGCTCTGG AACCAGTTGA AGCACCTGTT CGCGTTGTTC GACGACGTGG TCTCGGCGAA
3181 CACGAGCAGG AAGCCATAGC GGTCCGCGAA TGAGAGCAGG CCGGAGTTGT CGGCGTAGCC
3241 CTGGGCGTCC TGGGTGCAAC CGTGCAGGGC GAACACCACC GCCGGCTCCG CGGGCAGGGA
3301 CGCGGGCCGG TAGACGTACA TGTTCAGCCG GCCCGGGTTC GTGCCGAAGT CCGCGACCTC
3361 GGTCAGGTCC GCCTTGGTCA GACCGGGCTT GGCCAGGCCC GCCGCGGCGH GGGCCGTCGG
3421 CGCCGGGCCG AGCAGGGCCG CTCCGAGTAC GAGGGCCACG ACGGCCACGA GACGGGTGAG
3481 CACCCCCCGC CGTCCCGGAC GCGACAACGA CCCGACCGGC GGCGAGGAGG AGAGGGGGAA
3541 CAGCGGGGTG AGGATTCCCC GGAACGGCGG CGGCTGCATG GCGGCTCCCT CGATGTCGTG
3601 GGGGGGACAC GGAGGGCTCC CTGACGTCGA TCAGTGGGAG CGCCCCGGTG CCCGGCACCG
3661 TAGGGGTGGT TCAACCCGCA ACGGTATGGC CCGGAGCACC ACACCCCGCA CCGCGCGATG
3721 TGCGCCCGGA CGGATTGTGT CGCCTTGCGG AATCTGATAC CCGGACGCGA CGAACGCCCC
3781 ACCCGACACG GGTAGGGCGT CATGGTGTCC GACTCGGCCG GTCGGCCTTG CCTGCCCTGG
3841 ACGGACCGGG CGTCGGCGGA CCGGGCGTCG GCGGGCTGGG CGGTATGGCG GCCGAGGACG
3901 CCAGCCGCGT GGGGCGGCCG CGCCCAAGTG CAGTACGCCG ACCGTGGCCG GCGGGAGGGC
3961 CGGACCGGTC AGTGCAGTCC CGCGGCCC7G CGGGACCGCT CGTCCCAGAC GGGTTCCACC
4021 GCGGCGAACC GGGGTCCGTG TCCGCGGCGG TAGACCATCA GTGTCCGCTC GAAGGTGATG
4081 ACGATGACAC CGTCCTGGTT GTAGCCGATG GTGCGCACGC TGATGATGCC TACGTCAGGT
4141 CGGCTGGCGG ACTCCCGGGT GTTCAGGACC TCGGACTGCG AGTAGATGGT GTCGCCCTCG
4201 AAGACCGGGT TCGGCAGCCT GACCCGGTCC CAGCCGAGGT TGGCCATCAC ATGCTGGGAG
4261 ATGTCGGTGA CGCTCTGCCC GGTGACCAGG GCGAGGGTGA AGGTGGAGTC CACCAGCGGC
4321 TTGCCCCAGG TGGTGCCCGC CGAGTAGTGG CGGTCGAAGT GCAGCGGCGC GGTGTTCTGC
4381 GTCAGGAGCG TGAGCCAGGA GTTGTCGGTC TCCAGGACCG TGCGGCCCAG GGGGTGGCGG
4441 TACACGTCGC CGGTGGTGAA GTCCTCGAAG TAGCGGCCCT GCCAGCCCTC GACCACAGCG
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4501 GTGCGGGTGG CGTCCTGGTC CGGGTTCTCA GTCGTCATGG CGCTCATTCT GGGAAGTCCC 
4 561 CGGTCCGCTG TGAAATGCCG AACCTTCACC GGGCTCATAC GTGCGGCGCA TGAGCCCTGG 
4 621 ACCGTACGTA GTCGTAGAAC CTCGCCACCA CTGGCGCGCG TGGTCCTCCG GCGAGTGTGA 
4 681 CCACGCCGAC CGTGCGCCGC GCCTGCGGGT CGTCGAGCGG CACGGCGACG GCGTGGTCAC 
5 4 741 CGGGCCCGGA CGGGCTGCCG GTGAGGGGGG CGACGGCCAC ACCGAGGCCG GCGGCGACCA 
4 8 01 GGGCCCGCAG CGTGCTCAGC TCGGTGCTCT CCAGGACGAC CCGCGGCACG AATCCGGCCG 
4 8 61 CGGCGCACAG CCGGTCGGTG ATCTGGCGCA GTCCGAAGAC CGGCTCCAGT GCCACGAACG 
4 921 CCTCATCGGC CAGCTCCGCG GTCCGCACCC GGCGGCGTCT GGCCAGCCGG TGTCCGGGTG 
4 981 GGACGAGCAG GCACAGTGCC TCGTCCCGCA GTGGTGTCCA CTCCACATCG TCCCCGGCGG 

10 5041 GTCGTGGGCT GGTCAGCCCC AGGTCCAGCC TGCTGTTGCG GACGTCGTCG ACCACGGCGT 
5101 CGGCGGCGTC GCCGCGCAGT TCGAAGGTGG TGCCGGGAGC CAGCCGGCGG TACCCGGCGA 
5161 GGAGGTCGGG CACCAGCCAG GTGCCGTAGG AGTGCAGGAA ACCCAGTGCC ACGGTGCCGG 
5221 TGTCGGGGTC GATCAGGGCG GTGATGCGCT GCTCGGCGCC GGAGACCTCA CTGATCGCGC 
5281 GCAGGGCGTG GGCGCGGAAG ACCTCGCCGT ACTTGTTGAG CCGGAGCCGG TTCTGGTGCC 

5341 GGTCGAACAG CGGCACGCCC ACTCGTCGCT CCAGCCGCCG GATGGCCCTG GACAGGGTCG
5401 GCTGGGAGAT GTTGAGCCGT TCCGCGGTGA TCGTCACGTG CTCGTGCTCG GCCAAGGCCG
5461 TGAACCACTG CAACTCCCGT ATCTCCATGC AGGGACTATA CGTACCGGGC ATGGTCCTGG
5521 CGAGGTTTCG TCATTTCACA GCGGCCGGGC GGCGGCCCAC AGTGAGTCCT CACCAACCAG 
5581 GACCCCATGG GAGGGACCCC ATGTCCGAGC CGCATCCTCG CCCTGAACAG GAACGCCCCG 

5641 CCGGGCCCCT GTCCGGTCTG CTCGTGGTTT CTTTGGAGCA GGCCGTCGCC GCTCCGTTCG
5701 CCACCCGCCA CCTGGCGGAC CTGGGCGCCC GTGTCATCAA GATCGAACGC CCCGGCAGCG
5761 GCGACCTCGC CCGCGGCTAC GACCGCACGG TGCGTGGCAT GTCCAGCCAC TTCGTCTGGC
5821 TGAACCGGGG GAAGGAGAGC GTCCAGCTCG ATGTGCGCTC GCCGGAGGGC AACCGGCACC 

5881 TGCACGCCTT GGTGGACCGG GCCGATGTCC TGGTGCAGAA TCTGGCACCC GGCGCCGCGG
5941 GCCGCCTGGC ATCGGCCACC AGGTCCTCGC GCGGAGCCAC CGAGGCTGAT CACCTGCGGA

6001 CATATCCGGC TACGGCAGTA CCGGCTGCTA CCGCGGACCG CAAGGCGTAC GACCTCCTGG 
6061 TCCAGTGCGA AGCGGGGCTG GTCTCCATCA CCGGCACCCC CGAGACCCCG TCCAAGGTGG 
6121 GCCTGTCCAT CGCGGACATC TGTGCGGGGA TGTACGCGTA CTCCGGCATC CTCACGGCCC 
6181 TGCTGAAGCG GGCCCGCACC GGCCGGGGCT CGCAGTTGGA GGTCTCGATG CTCGAAGCCC 

6241 TCGGTGAATG GATGGGATAC GCCGAGTACT ACACGCGCTA CGGCGGCACC GCTCCGGCCC
6301 GCGCCGGCGC CAGCCACGCG ACGATCGCCC CCTACGGCCC GTTCACCACG CGCGACGGGC 
6361 AGACGATCAA TCTCGGGCTC CAGAACGAGC GGGAGTGGGC TTCCTTCTGC GGTGTCGTGC 
6421 TACAACGCCC CGGTCTCTGC GACGACCCGC GCTTTTCCGG CAACGCCGAC CGGGTGGCGC
6481 ACCGCACCGA GCTCGACGCC CTGGTGAGCG AGGTGACGGG CACGCTCACC GGCGAGGAAC
6541 TGGTGGCGCG GCTGGAGGAG GCGTCGATCG CCTACGCACG CCAGCGCACC GTGCGGGAGT
6601 TCAGCGAACA CCCCCAACTG CGTGACCGTG GACGCTGGGC TCCGTTCGAC AGCCCGGTCG 
6661 GTGCGCTGGA GGGCCTGATC CCCCCGGTCA CCTTCCACGG CGAGCACCCG CGGCGGCTGG 
6721 GCCGGGTCCC GGAGCTGGGC GAGCATACCG AGTCCGTCCT GGCGTGGCTG GCCGCGCCCC 
6781 ACAGCGCCGA CCGCGAAGAG GCCGGCCATG CCGAATGAAC TCACCGGAGT CCTGATCCTG
6841 GCCGCCGTGT TCCTGCTCGC CGGCGTACGG GGGCTGAACA TGGGCCTGCT CGCGC^GGTC
6901 GCCACCTTTC TGCTCGGGGT GGTCGCACTC GACCGAACGC CGGACGAGGT GCTGGCGGGT 
6961 TTCCCCGCGA GCATGTTCCT GGTGCTGGTC GCCGTCACGT TCCTCTTCGG GATCGCCCGC 
7021 GTCAACGGCA CGGTGGACTG GCTGGTACGT GTCGCGGTGC GGGCGGTGGG GGCCCGGGTG 
7081 GGAGCCGTCC CCTGGGTGCT CTTCGGCCTG GCGGCACTGC TCTGCGCGAC AGGCGCGGCC 

7141 TCGCCCGCGG CGGTGGCGAT CGTGGCGCCG ATCAGCGTCG CGTTCGCCGT CAGGCACCGC
7201 ATCGATCCGC TGTACGCCGG ACTGATGGCG GTGAACGGGG CCGCAGCCGG CAGTTTCGCC
7261 CCCTCCGGGA TCCTGGGCGG CATCGTCCAC TCGGCGCTGG AGAAGAACCA TCTGCCCGTC
7321 AGCGGCGGGC TGCTCTTCGC AGGCACCTTC GCCTTCAACC TGGCGGTCGC CGCGGTGTCA 
7381 TGGCTCGTCC TCGGGCGCAG GCGCCTCGAA CCACATGACC TGGACGAGGA CACCGATCCC 

7441 ACGGAAGGGG ACCCGGCTTC CCGCCCCGGC GCGGAACACG TGATGACGCT GACCGCGATG
7501 GCCGCGCTGG TGCTGGGAAC CACGGTCCTC TCCCTGGACA CCGGCTTCCT GGCCCTCACC
7561 TTGGCGGCGT TGCTGGCGCT GCTCTTCCCG CGCACCTCCC AGCAGGCCAC CAAGGAGATC
7621 GCCTGGCCCG TGGTGCTGCT GGTATGCGGG ATCGTGACCT ACGTCGCCCT GCTCCAGGAG
7681 CTGGGCATCG TGGACTCCCT GGGGAAGATG ATCGCGGCGA TCGGCACCCC GCTGCTGGCC
7741 GCCCTGGTGA TCTGCTACGT GGGCGGTGTC GTCTCGGCCT TCGCCTCGAC CACCGGGATC
7801 CTCGGTGCCC TGATGCCGCT GTCCGAGCCG TTCCTGAAGT CCGGTGCCAT CGGGACGACC 
7861 GGCATGGTGA TGGCCCTGGC GGCCGCGGCG ACCGTGGTGG ACGCGAGTCC CTTCTCCACC 
7921 AATGGTGCTC TGGTGGTGGC CAACGCTCCC GAGCGGCTGC GGCCCGGCGT GTACCAGGGG
7981 TTGCTGTGGT GGGGCGCCGG GGTGTGCGCA CTGGCTCCCG CGGCCGCCTG GGCGGCCTTC
8041 GTGGTGGCGT GAGCGCAGCG GAGCGGGAAT CCCCTGGAGC CCGTTTCCCG TGCTGTGTCG 
8101 CTGACGTAGC GTCAAGTCCA CGTGCCGGGC GGGCAGTACG CCTAGCATGT CGGGCATGGC 
8161 TAATCAGATA ACCCTGTCCG ACACGCTGCT CGCTTACGTA CGGAAGGTGT CCCTGCGCGA 
8221 TGACGAGGTG CTGAGCCGGC TGCGCGCGCA GACGGCCGAG CTGCCGGGCG GTGGCGTACT 
8281 GCCGGTGCAG GCCGAGGAGG GACAGTTCCT CGAGTTCCTG GTGCGGTTGA CCGGCGCGCG 
8341 TCAGGTGCTG GAGATCGGGA CGTACACCGG CTACAGCACG CTCTGCCTGG CCCGCGGATT 
8401 GGCGCCCGGG GGCCGTGTGG TGACGTGCGA TGTCATGCCG AAGTGGCCCG AGGTGGGCGA
8461 GCGGTACTGG GAGGAGGCCG GGGTTGCCGA CCGGATCGAC GTCCGGATCG GCGACGCCCG
8521 GACCGTCCTC ACCGGGCTGC TCGACGAGGC GGGCGCGGGG CCGGAGTCGT TCGACATGGT 

8581 GTTCATCGAC GCCGACAAGG CCGGCTACCC CGCCTACTAC GAGGCGGCGC TGCCGCTGGT
8641 ACGCCGCGGC GGGCTGATCG TCGTCGACAA CACGCTGTTC TTCGGCCGGG TGGCCGACGA
8701 AGCGGTGCAG GACCCGGACA CGGTCGCGGT ACGCGAACTC AACGCGGCAC TGCGCGACGA
8761 CGACCGGGTG GACCTGGCGA TGCTGACGAC GGCCGACGGC GTCACCCTGC TGCGGAAACG 
8821 GTGACCGGGG CGATGTCGGC GGCGGTCAGC GTCAGCGTCG TCGGCGCGGG CCTCGCGGAG 
8881 GGCTCCAGAT GCAGGCGTTC GACGCCGGCG GCGGAAGCGC CCGCCACCTC GGACACGCAG 
8941 GGGCAGTCGG AGTCCGCGAA GCCCGCGAAC CGGTAGGCGA TCTCCATCAT GCGGTTGCGG
9001 TCCGTACGCC GGAAGTCCGC CACCAGGTGC GCCCCCGCGC GGGCGCCCTG GTCCGTGAGC
9061 CAGTTCAGGA TCGTCGCACC GGCACCGAAC GACACGACCC GGCAGGACGT GGCGAGCAGT
9121 TTCAGGTGCC ACGTCGACGG CTTCTTCTCC AGCAGGATGA TGCCGACGGC GCCGTGCGGG
9181 CCGAAGCGGT CGCCCATGGT GACGACGAGG ACCTCATGGG CGGGATCGGT GAGCACGCGC
9241 GCAGGTCGGC GTCGGAGTAG TGCACGCCGG TCGCGTTCAT CTGGCTGGTC CGCAGCGTCA
9301 GTTCCTCGAC GCGGCTGAGT TCCTCCTCCC CCGCGGGTGC GATCGTCATG GAGAGGTCGA
9361 GCGAGCGCAG GAAGTCCTCG TCGGGACCGG AGTACGCCTC CCGGGCCTGG TCGCGCGCGA
9421 AACCCGCCTG GTACATCAGG CGGCGCCGAC GCGAGTCGAC CGTGGACACC GGCGGGCTGA
9481 ACTCCGGCAG CGACAGGAGC GTGGCCGCCT GCTCGGCCGG GTAGCACCGC ACCTCGGGCA
9541 GGTGGAACGC CACCTCGGCA CGCTCGGCGG GCTGGTCGTC GATGAACGCG ATCGTGGTCG
9601 GTGCGAAGTT CAGCTCCGTG GCGATCTCGC GGACGGACTG CGACTTCGGC CCCCATCCGA
9661 TGCGGGCCAG CACGAAGTAC TCCGCCACAC CGAGGCGTTC CAGACGCTCC CACGCGAGGT
9721 CGTGGTCGTT CTTGCTCGCC ACCGCCTGGA GGATGCCGCG GTCGTCGAGC GTGGTGATCA
9781 CCTCGCGGAT CTCGTCGGTG AGGACCACCT CGTCGTCCTC CAGCACGGTG CCCCGCCACA
9841 AGGTGTTGTC CAGGTCCCAG ACCAGACACT TGACAATGGT CATGGCTGTC CTCTCAAGCC 
9901 GGGAGCGCCA GCGCGTGCTG GGCCAGCATC ACCCGGCACA TCTCGCTGCT GCCCTCGATG 
9961 ATCTCCATGA GCTTGGCGTC GCGGTACGCC CGTTCGACGA CGTGTCCCTC TCTCGCGCCT 
10021 GCCGACGCGA GCACCTGTGC GGCGGTCGCG GCCCCGGCGG CGGCTCGTTC GGCGGCGACG 
10081 TGCTTGGCCA GGATCGTCGC GGGCACCATC TCGGGCGAGC CCTCGTCCCA GTGGTCGCTG
10141 GCGTACTCGC ACACGCGGGC CGCGATCTGC TCCGCGGTCC ACAGGTCGGC GATGTGCCCG 
10201 GCGACGAGTT GGTGGTCGCC GAGCGGCCGG CCGAACTGCT CCCGGGTCCG GGCGTGGGCC 
10261 ACCGCGGCGG TGCGGCAGGC CCGCAGGATC CCGACGCAGC CCCAGGCGAC CGACTTGCGC 
10321 CCGTAGGCGA GTGACGCCGC GACCAGCATC GGCAGTGACG CGCCGGAGCC GGCCAGGACC 
10381 GCGCCGGCCG GCACACGCAC CTGGTCCAGG TGCAGATCGG CGTGGCCGGC GGCGCGGCAG
10441 CCGGACGGCT TCGGGACGCG CTCGACGCGT ACGCCGGGGG TGTCGGCGGG CACGACCACC
10501 ACCGCACCGG AACCATCCTC CTGGAGACCG AAGACGACCA GGTGGTCCGC GTAGGCGGCG 
10561 GCAGTCGTCC AGACCTTGTG GCCGTCGACG ACAGCGGTGT CCCCGTCGAG CCGAACCCGC 
10621 GTCCGCATCG CCGACAGATC GCTGCCCGCC TGCCGCTCAC TGAAGCCGAC GGCCGCGAGT 
10681 TTCCCGCTGG TCAGCTCCTT CAGGAAGGTC GCCCGCTGAC CGGCGTCGCC GAGCCGCTGC
10741 ACGGTCCACG CGGCCATGCC CTGCGACGTC ATGACACTGC GCAGCGAACT GCAGAGGCTG 
10801 CCGACGTGTG CGGTGAACTC GCCGTTCTCC CGGCTGCCGA GTCCCAGACC GCCGTGCTCG 
10861 GCCGCCACTT CCGCGCAGAG CAGGCCGTCG GCGCCGAGCC GGACGAGCAG GTCGCGCGGC
10921 AGTTCGCCGG ACGTGTCCCA CTCGGCGGCC CGGTCACCGA CAAGGTCGGT CAGCAGCGCG
10981 TCACGCTCAG GCATCGACGG CCCGCAGCCG GTGGACGAGT GCGACCATGG ACTCGACGGT
11041 ACGGAAGTTC GCGAGCTGGA GGTCCGGGCC GGCGATCGTG ACGTCGAACG TCTTCTCCAG 
11101 GTACACGACC AGTTCCATCG CGAACAGCGA CGTGAGGCCG CCCTCCGCGA ACAGGTCGCG 
11161 GTCCACGGGC CAGTCCGACC TGGTCTTCGT CTTGAGGAAC GCGACCAACG CGTGCGCGAC 
11221 GGGGTCGTCC TTGACGGGTG CGGTCATGAG AACACCTTCT CGTATTCGTA GAAGCCCCGG 
11281 CCGGTCTTCC GGCCGTGGTG TCCCTCGCGG ACCTTGCCCA GCAGCAGGTC ACAGGGGCGG 
11341 CTGCGCTCGT CGCCGGTGCG TTTGTGCAGC ACCCACAGCG CGTCGACGAG GTTGTCGATG 
11401 CCGATCAGGT CCGCGGTGCG CAGCGGCCCG GTCGGATGGC CGAGGCACCC CGTCATGAGC 
11461 GCGTCGACGT CCTCGACGGA CGCGGTGCCC TCCTGCACGA TCCGCGCCGC GTCGTTGATC
11521 ATCGGGTGGA GCAGCCGGCT CGTGACGAAG CCGGGCGCGT CCCGGACGAC GATCGGCTTG 
11581 CGCCGCAGCG CCGCGAGCAG GTCCCCGGCG GCGGCCATGG CCTTCTCACC GGTCCGGGGT 
11641 CCGCGGATCA CCTCGACCGT CGGGATCAGG TACGACGGGT TCATGAAGTG CGTGCCGAGC 
11701 AGGTCCTCGG GCCGGGCCAC GGAGTCGGCC AGTTCGTCAA CCGGGATCGA CGACGTGTTC 
11761 GTGATGACCG GGATACCGGG CGCCGCTGCC GAGACCGTGG CGAGTACCTC CGCCTTGACC
11821 TCGGCGTGCT CGACGACGGC CTCGATCACC GCGGTGGCCG TACCGATCGC GGGCAGCGCG 
11881 GACGTGGCCG TCCGCAGCAC ACCGGGGTCG GCCTCGGCGG GCCCGGCCAC GAGTTGTGCC 
11941 GTCCGCAGTT CGGTGGCGAT CCGCGCCCGC GCCGCCGTAA GGATCTCCTC GGACGTGTCG 
12001 ACGAGTGTCA CCGGGACGCC GTGGCGCAGC GCGAGCGTGG TGATGCCGGT GCCCATCACT 
12061 CCCGCGCCGA GCACGATCAG CTGGTGGTCC ACGCTGTTTC CTCCCTCCGG GGTCACCATG 
12121 GCAGCGAGTA CGGGTCGAGG ACGTCTTCCG GGGTCGACCC GATCGCGTCC TTGCGGCCGA 
12181 GGCCGAGTTC GTCGGCGAAG CCGAGCAGCA CGTCGAACGC GATGTGGTCG GCGAACGCGC 
12241 TGCCCGTCGA GTCGAGGACG CTCAGGCTGT CCCGGTGGTC CGCCGCGGTG TCCGGTGCCG 
12301 CGCACAGGGC CGCCAGCGAC GGGCCGAGCT CGCGGTCCGG CAGTTGCTGG TACTCGCCCT 
12361 CGGCGCGGGC CTGCCCCGGA TGGTCGACGC AGATGAACGC GTCGTCGAGC AGGGTCTTCG 
12421 GCAGTTCGGT CTTGCCCGGC TCGTCGGCGC CGATGGCGTT CACATGCAGG TGCGGCAGCC 
12481 GCGGCTCGGC GGGCAGCACC GGCCCTTTGC CCGAGGGCAC CGAGGTGACG GTGGACAGGA 
12541 CATCCGCGGC GGCGGCGGCC TCCGCCGGAT CGGTCACCTT GACCGGCAGT CCGAGGAACG 
12601 CGATGCGGTC CGCGAACGAC GCCGCGTGGC CGGGGTCGGT GTCGCTGACC AGGATCCGCT 
12661 CGATGGGCAG GACCCTGCTG AGCGCGTGCG CCTGGGTCAC CGCCTGTGCG CCCGCGCCGA 
12721 TCAGCGTGAG CGTGGCGCTG TCGGACCGGG CCAGCAGCCG GCTCGCGACG GCGGCGACCG 
12781 CGCCGGTCCG CATCGCGGTG ATCACGCCTG CGTCGGCGAG GGCGGTCAGA CTGCCGCTGT 
12841 CGTCGTCGAG GCGCGACATC GTGCCGACGA TCGTCGGCAG CCGGAAGCGC GGATAGTTGT
12901 GCGGACTGTA CGAAACCGTC TTCATGGTCA CGCCGACACC GGGGACCCGG TACGGCATGA 
12961 ACTCGATGAC GCCGGGAATG TCGCCGCCGC GGACGAATCC GGTACGCGGC GGCGCCTCGG 
13021 CGAACTCGCC GCGGCCGAGC GCGGCGAACC CGTCGTGCAG CTCGCTGATC AGCCGGTCCA 
13081 TCATCACGTC GCGGCCGATC ACGGAGAGAA TCCGCTTGAT GTCACGTTGG CGCAGGACCC 
13141 TGGTCTGCAT GTGTCACCTC CCTTTCGTGG CCGGAGCTGT CTTGGTGGTG CCGCTCGGGG 
13201 CGGCTTCCGT TCTCATCGCA GCTCCCTGTC GATGAGGTCG AAAATCTCGT CCGCGGTCGC 
13261 GTCCGCGGAC AGCACGCCGG CCGGCGTGGT CGGGCGGGTC TCCCGCCGCC AGCGGTTGAG
13321 CAGGGCGTCC AGCCGGGTTC CGATCGCGTC CGCCTGGCGG GCGCCCGGGT CGACACCGGC 
13381 AACGAGTGCT TCCAGCCGGT CGAGCTGCGC GAGCACCACG GTCACCGGGT CGTCCGGGGA 
13441 CAGCAGTTCA CCGATGCGGT CGGCGAGTGC GCGCGGCGAC GGGTAGTCGA AGACGAGCGT
13501 GGCGGACAGT CGCAGACCGG TCGCCTCGTT GAGGCCGTTG CGCAGCTGCA CCGCGATGAG 
13561 CGAGTCCACA CCGAGTTCCC GGAACGCCGC GTCCTCCGGG ATGTCCTCCG GGTCGGCGTG 
13621 GCCCAGGACG GCCGCTGCCT TCTGCCGGAC GAGGGCGAGC AGGTCGGTGG GGCGTTCCTG 
13681 CTCGTTGCGG GCGCTCCGGC GGGCCGACGG CTTGGGCCGG CCACGCAGCA GCGGGAGGTC 
13741 CGGCGGCAGG TCGCCCGCCA CGGCGACGAC ACTGCCCGTT CCGGTGTGGA CGGCGGCGTC 
13801 GTACATGCGC ATGCCCTGTT CGGCGGTGAG CGCGCTCGCC CCACCCTTGC GCATACGGCG 
13861 CCGGTCGGCG TCGGTCAGGT CCGCGGTCAG GCCACTCGCC TGGTCCCACA GCCCCCACGC
13921 GATCGACAGC CCTGGCAGCC CTTGTGCACG CCGGTGTTCG GCGAGCGCGT CGAGGAACGC 
13981 GTTCGCCGCC GCGTAGTTGC CCTGACCGGG GGTGCCCAGC ACACCGGCCG CCGACGAGTA 
14041 GACGACGAAT GCGGCGAGGT CGGTGTCGCG GGTGAGCCGG TGCAGGTGCC AGGCGGCGTC
14101 GGCCTTGGGT TTGAGGACGG TGTCGATGCG GTCGGGGGTG AGGTTGTCGA GCAGGGCGTC 
14161 GTCGAGGGTT CCGGCGGTGT GGAAGACGGC GGTGAGGGGT TGAGGGATGT GGGCGAGGGT
14221 GGTGGCGAGT TGGTGGGGGT CGCCGACGTC GCAGGGGAGG TGGGTGCCGG GGGTGGTGTC
14281 GGGGGG^GGG GTGCGGGAGA GGAGGTAGGT GTGGGGGTGG TTCAGGTGGC GGGCGAGGAT
14341 GCCGGCGAGG GTGCCGGAGC CGCCGGTGAT GACGACGGCC CCCTCGGGGT CCAGCGGCCG
14401 CGGGACCGTG AGGACGATCT TGCCGGTGTG CTCGCCGCGG CTCATGGTCG CCAGCGCCTC
14461 GCGGACCTGC CGCATGTCGT GCACCGTCAC CGGCAGCGGG TGCAGCACAC CGCGCGCGAA
14521 CAGGCCGAGC AGCTCCGCGA TGATCTCCTT GAGCCGGTCG GGCCCCGCGT CCATCAGGTC
14581 GAACGGTCGC TGGACGGCGT GCCGGATGTC CGTCTTCCCC ATCTCGATGA ACCGGCCACC
14641 CGGCGCGAGC AGGCCGACGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT TGAGCACGAC
14701 GTCGACCGGC GGGAACGCGT CGGCGAACGC GGTGCTGCGG GAATCGGCCA GATGCGCTCC
14761 GTCCAGGTCC ACCAGATGGC GCTTCGCGGC GCTGGTGGTC GCGTACACCT CCGCGCCCAG
14821 GTGCCGCGCG ATCTGCCGGG CGGCGGAACC GACACCGCCG GTGGCCGCGT GGATCAGGAC
14881 CTTCTCGCCG GGGCGCAGCC CGGCGAGGTC GACCAGGCCG TACCACGCGG TCGCGAACGC
14941 GGTCATCACG GACGCCGCCT GCGGGAACGT CCAGCCGTCC GGCATCCGGC CGAGCATCCG
15001 GTGGTCGGCG ATGACCGTGG GGCCGAAGCC GGTGCCGACG AGGCCGAAGA CGCGGTCGCC
15061 CGGTGCCAGA CCGGAGACGT CGGCGCCGGT CTCCAGGACG ATGCCCGCGG CCTCGCCGCC
15121 GAGCACGCCC TGACCGGGGT AGGTGCCGAG CGCGATCAGC ACATCGCGGA AGTTGAGGCC
15181 CGCCGCACGC ACACCGATCC GGACCTCGGC CGGGGCGAGG GGGCGCCGGG GCTCCGCCGA
15241 GTCGGCCGCG GTGAGGCCGT CGAGGGTGCC CGTCCGCGCC GGCCGGATCA GCCACGTGTC
15301 GCTGTCCGGC ACGGTGAGCG GCTCCGGCAC CCGGGTGAGG CGGGCCGCCT CGAACCGGCC
15361 GCCGCGCAGC CGCAGACGCG GCTCGCCGAG TGCGACGGCG ATGCGCTGCT GCUCGGGGGC
15421 GAGCGTGACG CCGGACTCGG TCTCGACGTG GACGAACCGG CCGGGCTGCT CGGCCTGGGC
15481 GGCGCGCAGC AGTCCGGCCG CCGCGCCGGT GGCGAGGCCC GCGGTGGTGT GCACGAGCAG
15541 ATCCCCGCCG GAGCCGGTCA GGGCGGTCAG CAGCCGGGTG GTGAGCGCAC GCGTCTCGGC
15601 CACCGGGTCG TCGCCATCAG CGGCAGGCAA CGTGATGACG TCCACGTCGG TCGCGGGGAC
15661 ATCCGTGGGT GCGGCGACCT CGATCCAGGT GAGACGCATC AGGCCGGTGC CGACGGGTGG
15721 GGACAGCGGG CGGGTGCGGA CCGTCCGGAT CTCGGCGACG AGTTGGCCGG CGGAGTCGGC
15781 GACGCGCAGA CTCAGCTCGT CGCCGTCACG AGTGATCACG GCTCGGAGCA TGGCCGAGCC
15841 CGTGGCGACG AACCGGGCCC CCTTCCAGGC GAACGGCAGA CCCGCAGCGC TGTCGTCCGG
15901 CGTGGTGAGG GCGACGGCGT GCAGGGCCGC GTCGAGCAGC GCCGGATGCA CACCGAAACC
15961 GTCCGCCTCG GCGGCCTGCT CGTCGGGCAG CGCCACCTCG GCATACACGG TGTCACCATC
16021 ACGCCAGGCA GCCCGCAACC CCTGGAACGC CGACCCGTAC TCATAACCGG CATCCCGCAG
16081 TTCGTCATAG AACCCCGAGA CGTCGACGGC CACGGCCGTG ACCGGCGGCC ACTGCGAGAA
16141 CGGCTCCACA CCGACAACAC CGGGGGTGTC GGGGGTGTCG GGGGTCAGGG TGCCGCTGGC
16201 GTGCCGGGTC CAGCTGCCCG TGCCCTCGGT ACGCGCGTGG ACGGTCACCG GCCGCCGTCC
16261 GGCCTCATCA GCCCCTTCCA CGGTCACCGA CACATCCACC GCTGCGGTCA CCGGCACCAC
16321 AAGGGGGGAT TCGATGACCA GCTCGTCCAC TATCCCGCAA CCGGTCTCGT CACCGGCCCG
16381 GATGACCAGC TCCACAAACG CCGTACCCGG CAGCAGGACC GTGCCCCGCA CCGCGTGATC
16441 AGCCAGCCAG GGGTGAGTGC GCAATGAGAT CCGGCCAGTG AGAACAACAC CACCATCGTC
16501 GGCGGGCAGC GCTGTGACAG CGGCCAGCAT CGGATGCGCC GCACCCGTCA ACCCCGCCGC
16561 CGACAGATCG GTGGCACCGG CCGCCTCCAG CCAGTACCGC CTGTGCTCGA ACGCGTACGT
16621 GGGCAGATCC AGCAGCCGTC CCGGCACCGG TTCGACCACC GTGTCCCAGT CCACTGCCGT
16681 GCCCAGGGTC CACGCCTGCG CCAACGCCGT CAGCCACCGC TCCCAGCCGC CGTCACCGGT
16741 CCGCAACGAC GCCACCGTGT GAGCCTGCTC CATCGCCGGC AGCAGCACCG GATGGGCACT
16801 GCACTCCACG AACACCGACC CATCCAGCTC CGCCACCGCC GCGTCCAACG CCACCGGACG
16861 ACGCAGATTC CGGTACCAGT ACCCCTCATC CACCGGCTCC GTCACCCAGG CGCTGTCCAC
16921 GGTCGACCAC CACGCCACCG ACGCGGCCTT CCCTGCCACC CCCTCCAGTA CCTTGGCCAG
16981 TTCATCCTCG ATGGCTTCCA CGTGGGGCGT GTGGGAGGCG TAGTCGACCG CGATACGACG
17041 CACCCGCACG CCTTCGGCCT CATACCGCGC CACCACCTCC TCCACCGCCG ACGGGTCCCC
17101 CGCCACCACC GTCGAAGCCG GGCCGTTACG CGCCGCGATC CACACACCCT CGACCAGACC
17161 GACCTCACCG GCCGGCAACG CCACCGAAGC CATCGCTCCC CGCCCGGCCA GTCGCGCCGC
17221 GATGACCTGA CTGCGCAATG CCACCACGCG GGCGGCGTCC TCGAGGCTGA GGGCTCCGGC
17281 CACGCACGCC GCCGCGATCT CGCCCTGGGA GTGTCCGATC ACCGCGTCCG GCACGACCCC
17341 ATGCGCCTGC CACAGCGCGG CCAGGCTCAC CGCGACCGCC CAGCTGGCCG GCTGGACCAC
17401 CTCCACCCGC TCCGCCACAT CCGGCCGCGC CAACATCTCC CGCACATCCC AGCCCGTGTG
17461 CGGCAGCAAC GCCTGAGCGC ACTCCTCCAT ACGCGCGGCG AACACCGCGG AGTGGGCCAT
17521 GAGTTCCACG CCCATGCCGA CCCACTGGGC GCCCTGGCCG GGGAAGACGA ACACCGTACG 
17581 CGGCTGGTCC ACCGCCACAC CCGTCACCCG GGCATCGCCC AGCAGCACCG CACGGTGACC 
17641 GAAGACAGCA CGCTCCCGCA CCAACCCCTG CGCGACCGCG GCCACATCCA CACCACCCCC
17701 GCGCAGATAC CCCTCCAGCC GCTCCACCTG CCCCCGCAGA CTCACCTCAC CACGAGCCGA
17761 CACCGGCAAC GGCACCAACC CGTCAACAAC CGACTCCCCA CGCGACGGCC CAGGAACACC
17821 CTCAAGGATC ACGTGCGCGT TCGTACCGCT CACCCCGAAC GACGACACAC CCGCATGCGG
17881 TGCCCGATCC GACTCGGGCC ACGGCCTCGC CTCGGTGAGC AGCTCCACCG CACCGGCCGA
17941 CCAGTCCACA TGCGACGACG GCTCGTCCAC ATGCAGCGTC TTCGGCGCGA TCCCGTACCG
18001 CATCGCCATG ACCATCTTGA TCACACCGGC GACACCCGCC GCCGCCTGCG CATGACCGAT

18061 GTTCGACTTC AACGAACCCA GCAGCAGCGG AACCTCACGC TCCTGCCCGT ACGTCGCCAG 
18121 AATGGCCTGC GCCTCGATGG GATCGCCCAG CGTCGTCCCC GTCCCGTGCG CCTCCACCAC 
18181 GTCCACATCG GCGGCGCGCA GTCCGGCGTT CACCAACGCC TGCTGGATGA CACGCTGCTG 
18241 GGACGGGCCG TTGGGGGCGG ACAGCCCGTT GGAGGCACCG TCCTGGTTCA CCGCCGACCC 
18301 GCGGACGACC GCGAGAACGG TGTGTCCGTT GCGCTCGGCG TCGGAGAGCC GCTCCAGCAC
18361 AAGAACGCCG GCGCCCUCCG CCCAGCCGGT GCCGTTGGCG GCGTCCGCGA ACGCGCGGCA 

18421 GCGGCCGTCG GGGGAGAGTC CGCCCTGCTG CTGGAATTCC ACGAACCCGG TCGGGGTCGC
18481 CATGACGGTG ACACCGCCGA CCAGCGCCAG CGAGCACTCC CCGTGGCGCA GTGCGTGCCC
18541 GGCCTGGTGC AGCGCGACCA GCGACGACGA GCACGCCGTG TCCACCGTGA ACGCCGGTCC 

18601 CTGGAGCCCA TAGAAGTACG AGATCCGGCC GGTGAGCACG CTGGGCTGCA TGCCGATCGA
18661 GCCGAACCCG TCCAGGTCCG CGCCGACGCC GTACCCGTAC GAGAAGGCGC CCATGAACAC
18721 GCCGGTGTCG CTGCCGCGCA GTGTGCCCGG CACGATGCCC GCGCTCTCGA ACGCCTCCCA 
18781 TGTCGTTTCC AGCAGGATCC GCTGCTGGGG GTCCATGGCC CGTGCCTCAC GGGGGCTGAT 
18841 GCCGAAGAAC GCGGCATCGA AGCCGGCGGC GTCGGAGAGG AAGCCGCCGC GGTCCGTGTC 

18901 CGATCCGCCG GTGAGGCCGG ACGGGTCCCA GCCACGGTCG GCCGGGAAGC CGGTGACCGC
18961 GTCGCCGCCA CTGTCCACCA TGCGCCACAG GTCGTCGGGC GAGGTGACGC CGCCCGGCAG 
19021 TCGGCAGGCC ATGCCCACGA TGGCCAGCGG TTCGTCACGG GTCGCGGCGG CTGTGGGAAC 
19081 AGCGACCGGT GCGGCACCAC CGACCAGAGC CTCGTCCAAC CGCGACGCGA TGGCCCGCGG 
19141 CGTCGGGTAG TCGAAGACAA GCGTGGCGGG CAGTCGGACA CCGGTCGCCG CGGCGAGTCG 

19201 GTTCCGCAGT TCGACGGCGG TCAGCGAGTC GATACCCAGT TCCTTGAAGG CCGCGTCCGC
19261 GGACACGTCC GCGGCGTCCG CGTGGCCGAG CACCGCCGCC GCGTTGTCGC GGACCAGTGC 
19321 CAGCAGCGCG GTGTCCCGCT CAGCGCCGGA CATGGTGCCG AGCCGGTCGG CGAGCGGAAC 
19381 GGCGGTGGCC GCCGCCGGGC GCGATACGGC GCGGCGCAGA TCGGCGAAAA GCGGCGATGT 
19441 GTGCGCGGTG AGGTCCATCG TGGCCGCCAC GGCGAACGCG GTGCCGGTTC CGGCCGCGGC 

19501 TTCCAGCAGG CGCATGCCCA CACCGGCCGA CATGGGGCGG AAACCGCCGC GGCGGACACG
19561 GGTGCGGTTG GTGCCGCTCA TGCTGCCGGT GAGTCCGCTG TCATCGGCCC AGAGGCCCCA 
19621 GGCCAGCGAC AGCGCGGGCA GTCCTTCGGC ATGGCGCAGC GTCGCGAGTC CGTCGAGGAA 
19681 CCCGTTCGCC GCCGAGTAGT TGCCCTGGCC GCGGCCGCCC ATGATGCCCG CGACGGACGA 
19741 GTAGAGGACG AACGAGCGCA GGTCCGCGTC CCGGGTCAGC TCGTGCAGGT GCCAGGCGCC 

19801 GTCGGCTTTG GGGCGCAGTG TGGTGGCGAG CCGCTCCGGG GTGAGTGCCG TGGTCACGCC
19861 GTCGTCGAGC ACGGCTGCCG TGTGGAAGAC CGCCGTGAGC GGCCTGCCGG CGGCGGCGAG 
19921 CGCGGCGGCG AGCTGGTCCC GGTCGGCGAC GTCACAGCGG ATGTGGACAC CGGGAGTGTC 
19981 CGCCGGCGGT TCGCTGCGCG ACAGCAACAG GAGGTGGCGG GCGCCATGCT CGGCGACGAG 
20041 ATGCCGGGCG AGGAGACCTG CCAGCACACC CGAGCCGCCG GTGATGACCA CCGTGCCGTC 

20101 CGGGTCGAGC AGCGGTTCGG GCGTTTCCGC GGCGGCCGTG CGGGTGAACC GCGGCGCTTC
20161 GTACCGGCCG TCGGTGACGC GGACGTACGG CTCGGCCAGT GTCGTGGCGG CGGCCAGCGC 
20221 CTCGATGGGG GTGTCGGTGC CGGTCTCCAC CAGCACGAAC CGGCCCGGGT GCTCGGCCTG 
20281 GGCGGACCGG ACGAGGCCGG CGACCGCTCC TCCGACCGGT CCCGCGTCGA TCCGGACGAC 
20341 GAGGGTGGTC TCCGCAGGGC CGTCCTCGGC GATCACCCGG TGCAGCTCGC CGAGCACGAA 

20401 CTCGGTGAGC CGGTACGTCT CGTCGAGGAC ATCCGCGCCC GGTTCCGGGA GCGCGGAGAC
20461 GATGTGGACC GCGTCCGCAG GACCGGGCCC GGGAGTGGGC AGCTCGGTCC AGGAGAGGCC
20521 GTACAAGGAG TTCCGTACGA CGGCGGCGTC GCCGTCGACG TTCACCGGTC GCGCGGTCAG 
20581 CGCGGCGACG GTCACCACCG GTTGGCCGAC CGGGTCCGTC GCATGCACGG CAGCGCCGTC 
20641 CGGGCCCTGA GTGATCGTGA CGCGCAGCGT GGTGGCCCCG GTCGTGTGGA ACCGCACGCC
20701 GCTCCACGAG AACGGCAGCC GCACCTCCGC TTCCTGTTCC GCGAGCAGCG GCAGGCAGGT
20761 GACGTGCAAG GCCGCGTCGA ACAGCGCCGG GTGGACGCCA TAGTGCGGCG TGTCGTCCGC
20821 CTGTTCCCCG GCGATCTCCA CCTCGGCGTA CAGGGTTTCG CCGTCGCGCC AGGCGGTGCG 
20881 CAGTCCCTGG AACGCTGGGC CGTAGCTGTA GCCGGTCTCG GCCAGCCGCT CGTAGAACGC 
20941 GCTCACGTCG ACGCGTCGCG CGCCCGGCGG CGGCCACGCG GGCGGCGGGA CCGCCGCGAC
21001 GCTTCCGGCC CGGCCGAGGG TGCCGCTGGC GTGCCGGGTC CAGCTGTCCG TGCCCTCGGT 
21061 ACGCGCGTGG ACGGTCACTC GCCGCCGTCC GGCCTCATCG GCCCCTTCGA CGGTCACCGA 
21121 CACATCCACC GCGCCGGTCA CCGGCACCAC GAGCGGGGTC TCGATGACCA GTTCATCCAC 
21181 CACCCCGCAA CCGGTCTCGT CACCGGCCCG GATGACCAGC TCCACAAACG CCGTACCCGG 
21241 CAGCAGAACC GTGCCCCGCA CCGCGTGATC AGCCAGCCAG GGATGCGTAC GCAACGAGAT
21301 CCGGCCAGTG AGAACAACAC CACCACCGTC GTCGGCGGGC AGTGCTGTGA CGGCGGCCAG 
21361 CATCGGATGC GCCGCCCCGG TCAGCCCGGC CGCGGACAGA TCGGTGGCAC CGGCCGCCTC 
21421 CAGCCAGTAC CGCCTGTGCT CGAACGCGTA GGTGGGCAGA TCGAGCAGCC GTCCCGGCAC 
21481 CGGTTCGACC ACCGTGTCCC AGTCCACTGC CGTGCCCAGG GTCCACGCCT GCGCCAACGC 
21541 CGTCAGCCAC CGCTCCCAGC CGCCGTCACC GGTCCGCAAC GACGCCACCG TGTGAGCCTG
21601 TTCCATCGCC GGCAGCAGCA CCGGATGGGC GCTGCACTCC ACGAACACGG ACCCGTCCAG 
21661 CTCCGCCACC GCCGCGTCCA GCGCGACGGG GCGACGCAGG TTCCGGTACC AGTAGCCCTC 
21721 ATCCACCGGC TCGGTCACCC AGGCGCTGTC CACCGTGGAC CACCAGGCCA CCGACCCGGT 
21781 CCCGCCGGAA ATCCCCTCCA GTACCTCGGC CAACTCGTCC TCGATGGCTT CCACGTGGGG
21841 CGTGTGGGAG GCGTAGTCGA CCGCGATACG GCGCACTCGC ACGCCTTCGG CCTCGTACCG
21901 CGTCACCACT TCTTCCACCG CGGACGGGTC CCCCGCCACC ACAGTCGAAG ACGGGCCGTT
21961 ACGCGCCGCG ATCCACACGC CCTCGACCAG GTCCACCTCA CCGGCCGGCA ACGCCACCGA
22021 AGCCATCGCC CCCCGCCCGG CCAGCCGCCC GGCGATCACC TGGCTGCGCA AGGCCACCAC
22081 GCGGGCGGCG TCCTCAAGGC TGAGGGCTCC GGCCACACAC GCCGCCGCGA TCTCGCCCTG
22141 GGAGTGTCCG ACCACCGCGT CCGGCACGAC CCCATGCGCC TGCCACAGCG CGGCCAGGCT
22201 CACCGCGACC GCCCAGCTGG CCGGCTGGAC CACCTCCACC CGCTCCGCCA CATCCGGCCG
22261 CGCCAACATC TCCCGCACAT CCCAGCCCGT GTGCGGCAAC AACGCCCGCG CACACTCCTC
22321 CATACGAGCC GCGAACACCG CAGAACACGC CATCAACTCC ACACCCATGC CCACCCACTG
22381 AGCACCCTGC CCGGGAAAGA CGAACACCGT ACGCGGCTGA TCCACCGCCA CACCCATCAC
22441 CCGGGCATCG CCCAACAACA CCGCACGGTG ACCGAAGACA GCACGCTCAC GCACCAACCC
22501 CTGCGCGACC GCGGCCACAT CCACACCACC CCCGCGCAGA TACCCCTCCA GCCGCTCCAC
22561 CTGCCCCCGC AGACTCACCT CACTCCGAGC CGACACCGGC AACGGCACCA ACCCATCGAC
22621 AGCCGACTCC CCACGCGACG GCCCGGGAAC ACCCTCAAGG ATCACGTGCG CGTTCGTACC
22681 GCTCACCCCG AAAGCGGAGA CACCGGCCCG GCGCGGACGT CCCGCGTCGG GCCACGCCCG
22741 CGCCTCGGTG AGCAGTTCCA CCGCGCCCTC GGTCCAGTCC ACATGCGACG ACGGCTCGTC
22801 CACATGCAGC GTCTTCGGCG CGATGCCATA CCGCATCGCC ATGACCATCT TGATGACACC 
22861 GGCGACACCC GCAGCCGCCT GCGCATGACC GATGTTCGAC TTCAACGAAC CCAGCAGCAG
22921 CGGAACCTCA CGCTCCTGCC CGTACGTCGC CAGAATCGCG TGCGCCTCGA TGGGATCGCC
22981 CAGCGTCGTC CCCGTCCCGT GCGCCTCCAC CACGTCCACG TCGGCGGGGG CGAGCCCCGC 
23041 CTTGTGGAGG GCCTGGCGGA TGACGCGCTG CTGGGAGGGG CCGTTGGGTG CGGAGATGCC
23101 GTTGGAGGCG CCGTCCTGGT TGACGGCGGA GGAGCGGACG ACCGCGAGGA CGGTGTGTCC 
23161 GTTGCGCTCG GCGTCGGAGA GCTTTTCGAC GACGAGGACG CCGGCCCCCT CGGCGAAACC 
23221 GGTGCCGTCC GCCGCGTCAG CGAACGCCTT GCACCGTCCG TCCGGCGCGA CGCCGCCCTG 
23281 CCGGGAGAAC TCCACGAAGG TCTGTGGTGA TGCCATCACT GTGACACCAC CGACCAGCGC 
23341 CAGCGAGCAC TCCCCGGTCC GCAGCGCCTG CCCGGCCTGG TGCAGCGCGA CCAGCGACGA
23401 CGAACACGCC GTGTCGACCG TGACCGCCGG ACCCTCCATG CCGAAGAAGT ACGACAGCCG
23461 TCCGGCGAGC ACCGCGGGCT GTGTGCTGTA GGCGCCGAAT CCGCCCAGGT CCGCGCCCGT 
23521 GCCGTAGCCG TAGTAGAAGC CGCCGACGAA GACGCCGGTG TCGCTGCCGC GCAGGGTGTC 
23581 CGGCACGATG CCGGCGTGTT CGAGCGCCTC CCAGGCGATT TCGAGGAGGA TCCGCTGCTG
23641 CGGGTCGAGT GCGGTGGCCT CGCGCGGACT GATGCCGAAG AACGCGGCAT CGAAGTCGGC
23701 GGCGCCCGCG AGTGCGCCGG CCCGCCCGGT GGCGGACTCG GCGGCGGCGT GCAGCGCGGC 
23761 CACGTCCCAG CCGCGGTCGG TGGGGAAGTC GCCGATCGCG TCGCGGCCGT CCGCGACGAG
23821 CTGCCACAGC TCTTCCGGTG AGGTGACGCC GCCCGGCAGT CGGCAGGCCA TGCCGACGAC 
23881 GGCGAGCGGC TCGTTCGCCG CGGCGCGCAG CGCGGTGTTC TCCCGGCGGA GCTGCGCGTT
23941 GTCCTTGACC GACGTCCGCA GCGCCTCGAT CAGGTCGTTC TCGGCCATCG CCTCATCCCT
24001 TCAGCACGTG CGCGATGAGC GCGTCTGCGT CCATGTCGTC GAACAGTTCG TCGTCCGGCT 
24061 CCGCGGTCGT GGTGCTCGCG GGTGCCTGTG CCGGTGGTTC ACCGCCGTCC GGGGTCCCGT
24121 TGTCGTCCGG GGTCCCGTTG ACGTCCGGGG CCAGGAGGGT CAGCAGATGA CGGGTGAGCG 
24181 CGCCGGCGGC GGGATAGTCG AAGACGAGCG TGGCCGGCAG CGGAATGCCG AGGGCCTCGG 
24241 AGAGCCGGTT GCGCAGGCCG AGCGCGGTGA GCGAGTCGAC CCCGAGGTCC TTGAACGCCG 
24301 TGGTGGCCGT GACCGCCGCC GCGTCGGTGT GGCCCAGCAG GGTGGCGGCG GTGTCGCGGA
24361 CGACGCCGAG CAGCACCTGT TCCCGTTCCT TGTGGGGCAG GTCCGGCAGG CGTTCCAGCA 
24421 GGGAGCCGCC GTCGGTCGCG GAGCGCCGGG TGGGGCGCTG GATCGGTCGC CACAGCGGTG 
24481 ACGGGTCGCC GGGCCCGGGT GGGGCGGTCG CCACGACCAC GGCTTCCCCG GTGGCGCACG
24541 CGGCGTCGAG GAGGTCGGTC AGCCGGTCCG CCGCGGCGGT GAACGCCACG GCCGGCAGGC 
24601 CTTGTGCCCG GCGCAGGTCG GCCAGGGCCT GGAGCGGTCC GGCCGCCTCG CCGGACGGAA
24661 CGGCGAGAAC GAACGCGGTC AGGTCGAGGT CGCGGGTCAG GCGGTGCAGT TCCCAGGCCG
24721 ACTCGGCGGT GCCGTCCGCG TGGACGACCG CGGTCACCGG GGTTTCCGGC ACTGTGCCCG
24781 GCTCGTACCG GATCACTTCG GCGCCGTGTC CGCCGAGGTG TCCGGCGAGT TCCTCCGAAC
24841 CGCCCGCGAG GAGGACGGTG TCGCCGTACG AGGCCGCGGC CGTGGTGGGC GCGGCGGGGA
24901 CGAGGCGGGG CGCTTCGAGG CGCCCGTCGG CCAGGCGCAG GTGCGGTTCG TCGAGGCGGG
24961 AGAGGGCGGC GGCGCGGCGG GGGGTGACCG TGTCGGTGGT CTCCACGAGC ACGAGCCGGC
25021 CCGGTTCCGC GGTGTCGAGC AGTGCGGCGA CGGCACCGGC GACGGGCCCG GCCTCGGCGG 
25081 ACACCACCAG CGTGGCGCCG GCGGTCCTCG GGTCGTCCAG TGCGGTACGG ACCTCGTCGG 
25141 GACCGGATAC CGGGACGACG ATGACGTCGG GCGTGGCGTC GTCGCCGAGG TCGGTGTACC 
25201 GGCGGGCCGT GGTGCCGGGT GCCGCCGGGG CCCGGACGCC GGTCCAGGTG CGCCGGAACA 
25261 GCCGCACGTC CCCGTCCGGG CCCGTCGTGG CGGGGGGCCG GGTGATGAGC GAGCCGATCT 
25321 GAGCCACCGG CCGTCCCAGT TCGTCGGCGA GGTGCACGCG GGCGCCGCCC TCGCCCTCGC 
25381 CGTGGACGAA GGTGACGCGC AGTTTCGTGG CGCCGCTGGT GTGGACACGG ACGCCGGTGA
25441 ACGCGAACGG CAACCGTACC CCCGCGTTCT CGGCGGCCGC GCCGATGCTG CCCGCTTGCA
25501 GCGCGGTGAC GAGCAGCGCC GGGTGCAGTG TGTAGCGGGC GGCGTCCCTG GCGAGGGCGC 
25561 CGTCGAGGGC GACTTCGGCG CAGACGGTGT CTCCGTGGCT CCACGCGGCG GACATGCCGC 
25621 GGAACTCGGG GCCGAACTCG TATCCCGCGT CGTCGAGTCG CTGGTAGAAG GCCGCGACGT 
25681 CGACCGGTTC CGCGTGCTCG GGCGGCCAGG GCCCCGGCGT GGTGGCCGGT TCGGTGGTGG 
25741 CGATGCCGGC GAAGCCGGAG GCGTGGCGGG TCCATGTCCG GTCGCCGTCC GTCCGGGCGT
25801 GGACGCGCAC GGCACGGCGT CCGGTGTCGT CGGGCGCGGC GACGGTCACG CGCACCTGGA
25861 CGGCGCCGGT GGCGGGCAGG ACCAGCGGTG TCTCGACGAC CAGTTCGTCG AGCAGGTCGC
25921 AGCCTGCCTC GTCGGCGCCG CGTCCGGCCA ATTCCAGGAA GGCGGGTCCG GGCAGCAGTA
25981 CGGCGCCGTC GACGGAGTGA CCGGCCAGCC ATGGGTGGGT GGCCAGCGAG AACCGGCCGG
26041 TGAGCAGCAC CTCGTCGGAG TCGGGGAGCG CCACCGACGC GGCGAGCAGC GGGTGGTCGA 
26101 CGGCGTCGAG TCCGAGGCCG GAAGCGTCCG TGCCGGCCGC GGTCTCGATC CAGTAGCGCT
26161 CATGGTGGAA GGCGTATGTG GGCAGGTCGT GTGCCGTCGC CGTCGCGGGG ACGACCGCCG 
26221 CCCAGTCGAC GGGCACGCCG GTTGTGTGCG CCTCGGCCAG CGCGGTGAGC AGCCGGTGGA 
26281 CTCCCCCGCC GCGGCGGAGC GTGGCGACGG TCGCGCCGTC GATCGCGGGC AGCAGCACGG 
26341 GGTGCGCGCT GACCTCGACG AACACGGTGT CACCCGGCTC GCGGGCAGCG GTCACGGCCG 
26401 TGGCGAAGCC TACGGGGTGG CGCATGTTGC GGAACCAGTA CTCGTCGTCG AGCGGCGCGT
26461 CGATCCAGCG TTCGTCGGCG GTGGAGAACC ACGGGATCTC GGGCGTGCGC GAGGTGGTGT
26521 CCGCGACGAT CCGCTGGAGT TCGTCGTACA GCGGGTCGAC GAACGGGGTG TGGGTCGGGC 
26581 AGTCGACGGC GATGCGGCGC ACCCAGACGC CGCGGGCCTC GTAGTCGGCG ATCAGCGTTT
26641 CGACGGCGTC CGGGCGCCCG GCGACGGTCG TGGTGGTGGC GCCGTTGCGG CCCGCGACCC 
26701 AGACGCCGTC GATCCGGGCG GCATCCGCCT CGACGTCGGC GGCCGGGAGC GCGACCGAGC 
26761 CCATCGCGCC GCGTCCGGCG AGTTCGCGCA GGAGCAGGAG AACGCTGCGC AGCGCGACGA
26821 GGCGGGCACC GTCCTCCAGG GTGAGCGCTC CGGCGACACA GGCCGCGGCG ATCTCGCCCT 
26881 GGGAGTGTCC GATGACGGCG TCCGGGCGTA CGCCCGCGGC CTCCCACACG GCGGCCAGCG 
26941 ACACCATGAC GGCCCAGCAG ACGGGGTGCA CGACGTCGAC GCGGCGGGTC ACCTCCGGGT 
27001 CGTCGAGCAT GGCGATGGGG TCCCAGCCCG TGTGCGGGAT CAGCGCGTCG GCGCATTGGC 
27061 GCATCCTGGC GGCGAACACC GGGGAGGCCG CCATCAGTTC GACGCCCATG CCGCGCCACT 
27121 GCGGTCCTTG TCCGGGGAAG ACGAAGACGG TGCGCGGCTC GGTGAGCGCC GTGCCGGTGA
27181 CGACGTCGTC GTCGAGCAGC ACGGCGCGGT GCGGGAACGT CGTACGCCTG GCGAGCAGGC
27241 CCGCGGCGAT GGCGCGCGGG TCGTGGCCGG GACGGGCGGC GAGGTGCTCG CGGAGTCGGC 
27301 GGACCTGGCC GTCGAGGGCC GTGGCGGTCC GCGCCGAGAC GGGCAGTGGT GTGAGCGGCG 
27361 TGGCGATCAG CGGCTCACCG GGCTTCGAGG CCGACGGCTC CTCGGCCGGC GGCTCCCCGG 
27421 CCGGGTGGGC TTCCAGCAGG ACGTGGGCGT TGGTGCCGCT GACGCCGAAG GAGGACACAC
27481 CGGCGCGCCG CGGGCGGTCG GTCTCGGGCC AGGGCCGGGC ATCGGTGAGG AGTTCGACGG 
27541 CGCCGGCCGT CCAGTCGACG TGCGAGGACG GCGTGTCCAC GTGCAGGGTG CGCGGCAGGG 
27601 TGCCGTGCCG CATGGCGAGG ACCATCTTGA TGACACCGGC GACACCCGCG GCGGCCTGAG
27661 TGTGGCCGAT GTTGGACTTC AGCGAGCCCA GCAGCACCGG GGTGTCGCGC CCCTGCCCGT 
27721 AGGTGGCCAG CACCGCCTGT GCCTCGATGG GATCGCCCAG CCTGGTGCCG GTGCCGTGCG
27781 CCTCCACGGC GTCCACGTCC GCCGGGGTGA GCCCGGCGTT GGCCAGGGCC TGCCGGATCA
27841 CCCGCTCCTG CGAGGGCCCG TTCGGCGCCG ACAACCCGTT GGAAGCACCG TCCTGGTTGA
27901 CCGCCGAACC CCGGACAACC GCCAGCACAC GGTGGCCGTT GCGCTCGGCA TCGGAGAGCC 
27961 TCTCGACGAT CAGCACACCG GACCCCTCGG CGAAACCGGT GCCGTCAGCC GCATCCGCGA 

28021 ACGCCTTGCA GCGCGCGTCG GGCGCGAGAC CCCGCTGCTG GGAGAACTCG ACGAAGCCGG
28081 ACGGCGAGGC CATCACCGTG ACGCCGCCGA CCAGGGCGAG CGAGCATTCG CCGGAGCGCA 
28141 GTGACTGCCC GGCCTGGTGC AGCGCCACCA GCGACGACGA ACACGCCGTG TCGACCGTGA 
28201 CCGCCGGACC CTCCAGACCG TAGAAGTACG ACAGCCGACC GGACAGCACA CTGGTCTGGG
28261 TGCCGGTCGC GCCGAAACCG CCCAGGTCGG TGCCGAGTCC GTACCCGTCG GAGAAGGCGC
28321 CCATGAACAC GCCGGTGTCG CTTCCGCGCA GCGACTCCGG GAGGATCCCG GCGTGTTCCA
28381 GCGCCTCCCA CGAGGTCTCC AGGACCAGAC GCTGCTGCGG GTCCATCGCC AGCGCCTCAC
28441 GCGGACTGAT CCCGAAGAAC GCCGCGTCGA AGTCCGCCAC CCCGGCGAGG AAGCCACCAT
28501 GACGCACGGT CGACGTGCCC GGATGATCCG GATCGGGATC GTACAGCCCG TCCACGTCCC
28561 AACCACGGTC CGTCGGAAAC GCCGTGATCC CGTCACCACC CGACTCCAGC AGCCGCCACA
28621 AGTCCTCCGG CGACGCGACC CCACCCGGCA GCCGGCAGGC CATCCCCACG ATCGCCAACG
28681 GCTCGTCCTG CCGGACGGCC GCGGTCGTGG TGCGGGTCGG CGATGCCGTC CGGCCGGACA
28741 GCGCCGCGGT GAGCTTCGCC GCGACGGCGC GCGGCGTCGG GAAGTCGAAG ACCGCGGTGG
28801 CGGGCAGCCG TACGCCCGTC GCCTCGGTGA AGGCGTTGCG CAGCCGGATC GCCATGAGCG
28861 AGTCGACGCC GAGTTCCTTG AACGTGGCGG TCGCCTCGAC CCGTGCGGCA CCGTCGTGGC
28921 CGAGTACGGC CGCGGTGCAC TGCCGGACGA CGGCGAGCAC GTCCTTTTCG GCGTCCGCGG
28981 CGGAGAGCCG CGCGATCCGG TCGGCGAGGG TGGTGGCGCC GGCCGCCCGG CGCCGCGGCT
29041 CCCGGCGCGG TGCGCGCAGC AGGGGCGAGC TGCCGAGGCC GGCCGGGTCG GCGGCGACCA
29101 GCGCCGGGTC CGAGGACCGC AACGCCGCGT CGAACAGCGT CAGTCCGCCT TCGGCGGTCA
29161 GCGCCGTCAC GCCGTCGCGG CGCATGCGGG CGCCGGTGCC GACCGTCAGC CCGCTCTCCG
29221 GTTCCCACAG GCCCCAGGCC ACGGACAACG CGGGCAGTCC GGCTGCCCGG CGCTGTTCGG
29281 CCAGCGCGTC GAGGAACGCG TTCGCGGCCG CGTAGTTGCC CTGTCCGGGG CTGCCGAGCA 
29341 CACCGGCGGC CGACGAGTAG AGGACGAACG CGGCCAGTTC CGTGTCCTGG GTGAGTTCGT
29401 GCAGGTGCCA CGCGGCGTCC ACCTTCGGGC GCAGCACCGT CTCGAGCCGG TCGGGGGTGA
29461 GCGCGGTGAG GACGCCGTCG TCGAGGACGG CCGCGGTGTG CACGACGGCC GTGAGCGGGT
29521 GCGCCGGGTC GATCCCCGCC AGTACGGAGG CGAGTTCGTC CCGGTCGGCG ACGTCGCAGG
29581 CGATCGCCGT GACCTCGGCG CCGGGCACGT CGCTCGCCGT GCCGCTGCGC GACAGCATCA
29641 GCAGCCGGCG CACGCCGTGG CGTTCGACGA GGTGGCGGCT GATGATGCCG GCCAGCGTCC
29701 CGGAGCCACC GGTGACGAGC ACGGTGCCGT CCGGGTCGAG CGCCGGAGCG TCACCCGCCG 
29761 GGACCGCCGG GGCCAGACGG CGGGCGTACA CCTGGCCGTC ACGCAGCACC ACCTGGGGCT 
29821 CATCGAGCGC GGTGGCCGCT GCGAGCAGCG GCTCGGCGGT GTCCGGGGCG GCGTCGACGA
29881 GGACGATCCG GCCGGGGTGT TCGGCCTGCG CGGTCCGCAC CAGTCCGGCG GCCGCGGCCG
29941 ACGCGAGACC GGGCCCGGTG TGGACGGCCA GGACCGCGTC GGCGTACCGG TCGTCGGTGA
30001 GGAAGCGCTG CACGGCGGTC AGGACGCCGG CGCCCAGTTC GCGGGTGTCG TCGAGCGGGG 
30061 CACCGCCGCC GCCGTGCGCG GGGAGGATCA CCACGTCCGG GACCGTCGGG TCGTCGAGGC 
30121 GGCCGGTCGT CGCGGTCGTG GGCGGCAGCT CCGGGAGCTC GGCCAGCACC GGGCGCAGCA
30181 GGCCCGGAAC GGCTCCCGTG ATCGTCAGGG GGCGCCTGCG CACGGCGCCG ATGGTGGCGA 
30241 CGGGCCCGCC GGTCTCGTCC GCGAGGTGTA CGCCGTCAGC GGTGACGGCG ACGCGTACCG 
30301 CCGTGGCGCC GGTGGCGTGG ACGCGGACGT CGTCGAACGC GTACGGAAGG TGGTCCCCTT 
30361 CCGCGGCGAG GCGGAGTGCG GCGCCGAGCA GCGCCGGGTG CAGGCCGTAC CGTCCGGCGT
30421 CGGCGAGC7G TCCGTCGGCG AGGGCCACTT CCGCCCAGAC GGCGTCGTCG TCGGCCCAGA
30481 CGGCGCGCGG GCGGGGCAGC GCGGGCCCGT CCGTGTACCC GGCTCGGGCC AGACGGTCGG
30541 CGATGTCGTC GGGGTCCACC GGCCGGGCCG TGGCGGGCGG CCACGTCGAC GGCATCTCCC 
30601 GCACGGCCGG GGCCGTCCGC GGGTCGGGGG CGAGGATTCC GTGCGCGTGC TCGGTCCACT
30661 CCCCCGCCGC GTGCCGCGTG TGCACGGTGA CCGCGCGGCG GCCGTCCGCC CCGGGCGCGC
30721 TCACCGTGAC GGAGAGCGCG AGCGCACCGG ACCGCGGCAG CGTGAGGGGG GTGTCCACGG 
30781 TGAACGTGTC GAGGGCGCCG CAGCCGGCTT CGTCGCCCGC CCGGATCGCC AGATCCAGGA 
30841 GGGCCGCGGC GGGCAGCACC GCGAGGCCGT GCAGGGAGTG CGCCAGCGGA TCGGCGGCGT
30901 CGACCCGGCC GGTGAGCACC AGGTCGCCGG TGCCGGGCAG GGTGACCGCC GCGGTCAGCG 

30961 CCGGGTGCGC GACCGGCGTC TGTCCGGCCG GGGCCGCGTC GCCCGCGGTC TGGGTGCCGA
31021 GCCAGTAGCG GACCCGCTCG AACGGGTACG TCGGCGGGTG CGAGGCGCGT GCCGGCGCGG 
31081 GGTCGATGAC CTTCGGCCAG TCGACCGTGA CGCCGTCGGT GTGCAGCCGG GCGAGCGCGG 
31141 TCAGGGCGGA TCGCGGTTCG TCGTCGGCGT GCAGCATCGG GATGCCGTCG ACGAGTCGGG 
31201 TCAGGCTCCG GTCCGGGCCG ATCTCCAGGA GCACCGCCCC GTCGTGCGCG GCGACCTGTT 

31261 CCCCGAACCG GACGGTGTCG CGGACCTGTC GTACCCAGTA CTCCGGCGTG GTGCAGGCGG
31321 CGCCCGCGGC CATCGGGATC CTCGGCTCGT GGTACGTCAG GCTCTCCGCG ACCTTGCGGA 
31381 ACTCCTCGAG CATCGGCTCC ATCCGCGCCG AGTGGAACGC GTGGCTGGTC CGCAGGCGGG 
31441 TGAAGCGGCC GAGCCGGGCC GCGACGTCGA GCACCGCCTC CTCGTCACCG GAGAGCACGA 
31501 TCGACGCGGG CCCGTTGACC GCGGCGATCT CCACGCCGTC CCGCAGCAGG GGCAGCGCGT 

31561 CCCGTTCCGA CGCGATCACG GCGGCCATCG CCCCGCCGGA CGGCAGCGCC TGCATCAGGC
31621 GGGCCCGTGC GGACACCAGC CTGCACGCGT CCTCCAGGGA CCAGACGCCG GCGACGTACG 
31681 CGGCGGCCAG CTCGCCGATC GAATGGCCCA CGAAGGCGTC CGGGCGTACG CCCCACGCCT 
31741 CGAGCTGTGC GCCGAGTGCG ACCTGGAGCG CGAACACCGC GGGCTGGGCG TACCCGGTGT
31801 CGTGGAGGTC GAGCCCGGCG GGCACGTCGA GGGCGTCCAG CACCTCGCGG CGAGTGCGGG 

31861 CGAAGACGTC GTAGGCGGCG GCCAGTCCGT CGCCCATGCC GGGACGTTGT GAGCCCTGTC
31921 CGGAGAAGAG CCACACGAGG CGGCGGTCCG GTTCTGCGGC GCCGGTGACC GTGTCGGTGC 
31981 CGATCAGCGC GGCCCGGTGC GGGAAGGCCG TGCGGGCGAG CAGGGCCGCG GCCACCGCGC 
32041 GCTCGTCCTC CTCGCCGGTG GCGAGGTGGG CGCGCAGGCG GTGTACCTGT GCGTCGAGTG 
32101 CCTGCGGGGT GCGTGCCGAG AGCAGCAGGG GCAGCGGTCC GGTGTCGGGT GCCGGGGCGG 

32161 GTTCGGGGGC CGGTCGGGGG TGGCTTTCGA GGATGATGTG AGCGTTGGTG CCGCTAACGC
32221 CGAAGGAGGA CACCCCGGCG CGCCGTGGGC GGTCGGTTTC GGGCCAGGGG CGGGCGTCGG 
32281 TGAGGAGTTC GAGGGCGCCG GCCGTCCAGT CGACGTGCGA GGACGGCGTG TCCACGTGCA 
32341 GGGTGCGCGG CAGGGTGCCG TGCCGCATGG CGAGGACCAT CTTGATGACA CCGGCGACGC
32401 CCGCGGCGGC CTGAGTGTGG CCGATGTTGG ACTTCAGCGA GCCCAGCAGC ACCGGGGTGT 

32461 CGCGATGCTG CCCGTAGGTG GCCAGTACCG CCTGCGCCTC GATGGGGTCG CCCAGCCTGG
32521 TCCCGGTGCC ATGCGCCTCG ACAGCGTCCA CATCCGCCGG GGTGAGCCCG GCGTTGGCCA 
32581 GCGCCTGCCG GATCACCCGC TCCTGCGACG GCCCGTTCGG CGCCGACAAC CCGTTGGAAG 
32641 CACCGTCCTG GTTGACCGCC GAACCACGCA CGACCGCCAG GACATTGTGG CCGTGCCGCT
32701 CGGCGTCGGA GAGCCTCTCG ACGATCAGCA CACCGGATCC CTCGGCGAAA CCGGTGCCAT

32761 CAGCCGCATC CGCGAACGCC TTGCAGCGGC CGTCCGGGGA GAGGCCCCGC TGCTGGGAGA
32821 AGTCCACGAA GCCGGACGGC GAGGCCATCA CCGTGACGCC GCCGACCACG GCGAGCGAGC
32881 ACTCCCCCGA GCGCAGCGAC TGCCCGGCCT GGTGCAGCGC CACCAGCGAC GACGAACACG 
32941 CCGTGTCCAC CGTGACCGCC GGACCCTCCA AACCGTAGAA GTACGACAGC CGACCGGACA
33001 GCACACTGGT CTGGGTGCTG GTGGCACCGA AACCGCCGCG GTCGGCTCCA GTGCCGTACC 

33061 CGTAGAAGTA GCCGCCCATG AACACGCCGG TGTCGCTTCC GCGCAGCGAC TCCGGGAGGA
33121 TCCCGGCGTG TTCCAGCGCC TCCCACGAGG TCTCCAGGAC CAGACGCTGC TGCGGGTCCA 
33181 TCGCCAGCGC CTCACGCGGA CTGATCCCGA AGAACGCCGC GTCGAAGTCC GCCACCCCGG 
33241 CGAGGAAGCC ACCATGACGC ACGGTCGACG TGCCCGGATG ATCCGGATCG GGATCGTACA 
33301 GCCCGTCCAC GTCCCAACCA CGGTCCGTCG GAAACGCCGT GATCCCGTCA CCACCCGACT 

33361 CCAGCAGCCG CCACAAGTCC TCCGGCGACG CGACCCCACC CGGCAGCCGG CAGGCCATCC
33421 CCACGATCGC CAACGGCTCG TCCTGCCGGA CGGCCGCGGT CGGGGTACGC CGCCGGGTGG 
33481 TGGCCCGCGC GCCGGCCAGT TCGTCCAGGT GGGCGGCGAG CGCCTGCGCC GTGGGGTGGT 
33541 CGAAGACGAG CGTAGCGGGC AGCGTCAGGC CCGTCGCGTC GGCCAGCCGG TTGCGCAGTT 
33601 CGACGCCGGT CAGCGAGTCG AAGCCCACTT CCCTGAACGC GCGCGCGGGT GCGATGGCGT
33661 GGGCGTCGCG GTGGCCGANN ACCGCGGCAG CGCTGGTACG GACGAGGTCG AGCATGTCGC
33721 GCGCGGCCGG AGGTGNNNAN GTGCGCCGGA CGGCCGGCAC GAGGGTGCGT AGGACCGGCG
33781 GGACCCGGTC GGACGNGGNN ACGGCGGCGA GGTCGAGCCG GATCGGCACG AGCGCGGGCC
33841 GGTCGGTGTG CAGNGCCCCC TCGAACAGGG CGAGCCCCTG TGCGGCCGTC ATCGGGGTCA
33901 TGCCGTTGCG GGCGATGNNN GCCAGGTCGG TGGCGGTCAG CCGCCCGCCC ATCCCGTCCG
33961 CCGCGTCCCA CAGTCCCCAG GCGAGCGAGA CGGCGGGCAG CCCCTGGTGG TGCCGGTGGC
34021 GGGCGAGCGC GTCGAGGAAN GCGTTGCCGG TCGCGTAGTT GGCCTGACCC GCGCCGCCGA
34081 ACGTGGCGGA TATGGANNAN TACAGGACGA ACGCGGCCAG GTCGAGATCG CGCGTCAGCT
34141 CGTGCAGGTG CNANNNNANN TCCGCCTTGA CCCGCAGCAC GGCGTCCCAC TGCTCCGGCC
34201 GCATGGTCGT NNNNNNNNNN TCGTCGACGA TCCCGGCCAT GTGCACGACG GCGCGCAGCC
34261 GCTGGGCGAC GTCGGNNANN ACTGCGGCCA GCTCGTCGCG GTCGACGACG TCGGCGGCCA
34321 CGTACCGCAC GNGNTNNTNN TCCGGCGTGT CGCCGGGCCG GCCGTTGCGG GACACCACGA
34381 CGACCTCGGC GNNNTNNTNN ACGGTGAGCA GGTGGTCCAC GAGGAGGCGG CCGAGCCCGC
34441 CGGTGCCGCC NNTNANNANN ACGGTCCCGC CGGTCAGCGG GGAGGTTCCG GTGGCCGCGG
34501 CGACACGGCG NANANNNNNN GCACGCGCTG TGCCGTCGGC GACCCGGACG TGCGGCTCGT
34561 CGCCGGCGGC NANNNNNNNN GCTATGGCGG CGGGCGTGAT CTCGTCCGCT TCGATCAGGG
34621 CGACGCGGCC GGGATNNTNN GTCTCCGCCG TCCGGACCAG GCCGCCGAGC GCTTCCTGCG
34681 CGGGATCGCC GGTANGGNTN GCCACGATGA GCCGGGATCG CGCCCAGCGC GGCTCGGCGA
34741 GCCAGGTCTG NANGGTGNTN AGCAGGTCGC GGCCCAGCTC CCGGGTCCGG GCGCCGGGCG
34801 AGGTGCCCGG NTNNNNNNCT TCCACGGCCA GGACCACGAC CGGGGGGTGC TCGCCGTCGG
34861 GCACGTCGGC GAGGTANNTN CAGTCGGGGA CGGGTGACGC GGGCACGGGC ACCCAGGCGA
34921 TCTCGAACAG CGCCTCCNNN TCGGGGTCGG CGGCCCGCAC GGTCAGGCTG TCGACGTCAA
34981 GGACCGGTGA GCNGTGNTNG TCCGTGGCGA CGATGCGGAC CATGTCGGGG CCGACGCGTT
35041 CCAGCAGCAC GNNNNNNNNN GTCGCGGCGC GCGCGTGGAT CCTCACGCCG GACCAGGAGA
35101 ACGCCAGCNN NNNNNNNNNN GGGTCCGTGA AGACCGTCCC GAGGGCGTGC AGGGCCGCGT
35161 CGAGCAGCAC NGGNTNNANN CCGTACCGGG CGTCGGTGAG CTGTTCGGCG AGGCGGACCG
35221 ACGCGTAGGC GCGGCCCTCC CCCGTCCACA TCGCGGTCAT GGCCCGGAAC GCGGGCCCGT
35281 ACGAGAGCGG NAGNGNGTNN TAGAAGCCGG TCAGGTCGGC CGGGTCGGCG TCGGCGGGCG
35341 GGCAGTCCAC NNNNNNNNNN GGACCGCCAG TGTCCACGCT CAGCGCTCCG GTCGCACTGA
35401 GCGCCCAGGG NNNNNNNNNN GTACGGCTGT GCAGACTCAC CGACCGCCGT CCGGACACCT
35461 CGGTTCCGAN NNNNNNNNNN ATCTCCGTGT CGCCGTCGCC GTCGACCACC ACCGGCGCGA
35521 CGATGGTCAG NTNNNNNATN TCCGGCGTGC CGAGCCGGGC TCCCGCTTCG GCGAGCAGTT
35581 CCACGAGCGC NGANNNNNNN ACGATGACCC GGCCGTCCAC CTCGTGGTCG GCGAGCCAGG
35641 GCTGACGGCG TANNGANANA CCGCGGTGGC CAGCGCGCCC TCGCCGTCGG GCGAGGTCGA
35701 CCCACGAGCC NANNANNNNN TGGCCGGACG TTCCCGCCGG TTCCGCGTCG ATCCAGTAGC
35761 GGTCACGGCG GAANGGGTAN GTGGGCAGCG GCACCACCCG ACGCGTCGCG AACGACCAGG
35821 TGACGGGCAC NNNNNNNNNN CAGAGCGCGG CGAGCGACCG AGTGAAGCGG TCCAGGCCGC
35881 CCTCGCCTCG NNGNAGTNTN CCGGTGACGA CCGTATGCGC ATGCCCGGCG AGCGTGTCCT
35941 CCAGTGCGGT GNTNANNANN GGATGCGCGC TGACCTCGAC GAACGCGCGG TATCCGCGGT
36001 CCGCCAGGTG GNNGNTNNNN GCGGCGAACC GAACGGTGCG GCGCAGGTTG TCGTACCAGT
36061 AGGCGGCGTC NNNNNNNNNN TCCAGCCACG CCTCGTCCAC GGTGGAGAAG AACGGGACGT
36121 CCGGCGTGCG NGNANTNATN CCGGCGAGAG CGTCGAGCAG CGCGCCGCGG ATCGTTTCGA
36181 CATGCGCGGT GTGNGANNNN TAGTCGACGG CGATCCGGCG GGCGCGGGGG GTGGCGGCCA
36241 GCAGCTCCTC CACGGNGTNN GCCGCACCGG CGACAACGAT CGACGCGGGT CCGTTGACCG
36301 CGGCGACCTC CAGGNGNNNN GCCCACACGG CGGCGTCGAA GTCGGCGGGC GGCACCGAGA
36361 CCATGCCGCC CTGCCCGGCC AGTTCGGTGG CGACGAGTCG GCTGCGCACC GCGACGACCT
36421 TCGCGGCGTC GTCCAGGGTG AGCACCCCGG CGACGCAGGC CGCGGCGACT TCGCCCTGGG
36481 AGTGGCCGAC GACCGCGGCC GGGGCGACCC CGTGCGCACG CCACAGCTCC GCCAGCGCCA
36541 CCATCACCGC GAACGACGCG GGCTGCACGA CATCGACCCG GTCGAACGCG GGCGCTCCGG
36601 GCCGCTGGGC GATGACGTCC AGCAGGTCCC ATCCGGTGTG CGGGGCGAGC GCCGTGGCGC
36661 ACTCGCGGAG CCGCCGGGCG AACACGGGCT CGGTGGCGAG CAGTTCGGCA CCCATGCCGG
36721 CCCACTGGGA GCCCTGCCCG GGGAACGCGA ACACGACACG TGTGTCGGTG ACGTCGGCGG
36781 TTCCCGTCAC GGCCCCCGGC ACTTCGGCAC CACGGGCGAA CGCCTCCGCC TCTCGGGCCG
36841 GCACGACCGC CCGGTGGCGC ATGGCCGTCC GGGTGGTGGC GAGCGAGTGG CCGACCGCGG
36901 CCGCGGCGCC AGTGAGCGGG GCCAGCTGTC CCGCGACGTC CCGCAGTCCC TCCGGGGTCC
36961 GGGCCGACAT CGGCCAGACC ACGTCCTCGG GCACCGGCTC GGCTTCGGGT GCGGACACGG 
37021 GTGCGGGCGC GGCGGGGGGC CCGGCCTCCA GGACGACATG GGCGTTGGTG CCGCTGATGC 
37081 CGAACGACGA GACACCCGCA CGCCGGGCGC GCCCGGTGAC CGGCCACGGC TCACTGCGGT 
37141 GCAGCAGCCG GATGTCGCCG TCCCAGTCGA CGTGCCGGGA CGGCTCGTCG ACGTGCAGCG 
37201 TGCGCGGCAG GACGCCGTGC CGCATCGCCA TGACCATCTT GATGACGCCG GCGACGCCGG 
37261 CCGCGGCCTG GGTGTGGCCG ATGTTCGACT TGAGCGAGCC GATCAGCAGC GGATGCACGC
37321 GTTCGCGCCC GTAGGCCACT TGCAGGGCCT GGGCCTCGAC GGGGTCGCCG AGACGGGTGC 
37381 CGGTGCCGTG TGCCTCCACG GCGTCGACGT CACCCGGCGC CAGGCCGGCG TCGGCGAGCG 
37441 CACGCTGGAT GACGCGCTGC TGCGCAGGCC CGTTCGGGGC GGACAGCCCG TTCGACGCGC
37501 CGTCGGAGTT GACCGCGGAG CCGCGCACCA GCGCCAGCAC GGGGTGGCCG TGGCGGGTGG 
37561 CGTCGGAGAG CCGCTCCAGC ACCAGGACAC CGGCGCCCTC GGCGAAGCTC GTGCCGTCCG 
37621 CGGTGTCCGC GAAGGCCTTG GCACGGCCGT CGGGGGCGAG CCCGCGCTGC CGGGAGAACT 
37681 CGACGAACCC GGTCGTCGTC GCCATCACCG TGACACCGCC GACCAGGGCG AGCGAGCACT 

37741 CCCCCGAGCG CAGCGACCGC GCGGCCTGGT GCAGCGCCAC CAGCGACGAC GAACACGCCG
37801 TGTCGACGGT GACCGACGGG CCCTCCAGAC CGAAGTAGTA CGAGAGCCGC CCGGAGAGAA 

37861 CGCTGGTCGG CGTGCCGGTC GCCCCGAAAC CGCCCAGGTC CACGCCCGCG CCGTAGCCCT
37921 GGGTGAACGC GCCCATGAAT ACGCCGGTGT CGCTGCCGCG GACGCTTTCG GGCAGGATGC

37981 CCGCTCGTTC GAACGCCTCC CACGACGCTT CGAGGACCAG ACGCTGCTGC GGGTCCATCG
38041 CCAGCGCCTC ACGCGGGCTG ATCCCGAAGA ACGCGGCGTC GAAGTCGGCG GCGCCGGTGA 
38101 GGAAGCCGCC GTGACGCACG GAAACCTTGC CGACCGCGTC GGGGTTCGGG TCGTAGAGCG 
38161 CGGCGAGGTC CCAGCCGCGG TCGGCGGGGA ACTCGGTGAT CGCGTCCCCG CCGGAGTCGA 
38221 CCAGCCGCCA CAGGTCCTCC GGTGACCGCA CGCCACCGGG CATCCGGCAC GCCATGGCCA 
38281 CGATCGCCAG CGGCTCGTTC CCCGCCACCG TCGGTGCGGG CACTGTCGCC GCCGGAGCGG 
38341 CAGGGGCCGG CTCACCCCGC CGTTCCTCAT CCAGGCGGGC GGCGAGCGCG GCCGGTGTCG 
38401 GGTGGTCGAA GACGGCCGTC GCGGAGAGCC GTACCCCCGT CGTCTCGGCG AGGCTGTTGC
38461 GCAACCGGAC ACCGCTGAGC GAGTCGATGC CGAGGTCCTT GAACGCCGTC GTGGGCGTGA
38521 TCTCGGAGGC GTCGGCGTGG CCGAGCACGG CGGCCGTGGC CGCACACACG ATGGCCAGCA 
38581 GGTCACGATC GCGGTCGCGG TCGCGGTCGC GGTTGTCCTC CGCACGGGCG GCGATGCGGC 

38641 GCTCGGTCCG CTGCCGGACG GGCTCGGTGG GAATCGCCGC GACCATGAAC GGCACGTCCG
38701 CGGCGAGGCT CGCGTCGATG AAGTGGGTGC CCTCGGCCTC GGTGAGCGGC CGGAACCCGT

38761 CGCGCACCCG CTGCCGGTCG GCGTCGTCAA GTTGTCCGGT GAGGGTGCTG GTGGTGTGCC
38821 ACATGCCCCA GGCGATGGAG GTGGCGGGTT GGCCGAGGGT GTGGCGGTGG GTGGCGAGGG 

38881 CGTCGAGGAA GGCGTTGGCG GCGGCGTAGT TTCCTTGTCC GGGGCTGCCG AGGACGGCGG
38941 CGGCGCTGGA GTAGAGGACG AAGTGGGTGA GGGGTTGGTT TTGGGTGAGG TGGTGCAGGT 
39001 GCCAGGCGGC GTTGGCTTTG GGGTGGAGGA CGGTGGTGAG GCGGTCGGGG GTGAGGGCGT 
39061 CGAGGATGCC GTCGTCGAGG GTGGCGGCGG TGTGGAAGAC GGCGGTGAGG GGTTGGGGGA 
39121 TGTGGGCGAG GGTGGTGGCG AGTTGGTGGG GGTCGCCGAC GTCGCAGGGG AGGTGGGTGC 
39181 CGGGGGTGGT GTCGGGGGGT GGGGTGCGGG AGAGGAGGTA GGTGTGGGGG TGGTTCAGGT 
39241 GGCGGGCGAG GATGCCGGCG AGGGTGCCGG AGCCGCCGGT GATGATGATG GCGTGTTCGG 
39301 GGTTGAGGGG GGTGGTGGTG GGTGGGGTGG TGGTGTGGAG GGGGGTGAGG TGGGGTCGGT 
39361 GGAGGGTGTG GTGGGTGAGG CGGAGGTGGG GGTGGTCGAG GGTGGCGAGT TGGGCCAGGG 
39421 GGAGGGGAGT GTGGGGGTGG TCGGTTTCGA TGAGGCGGAT GCGGTGGGGG TGTTCGTTCT 
39481 GGGCGGTGCG GGTGAGGCCG GTGACGGTGG CGCCGGCGGG GTCGGTGGTG GTGTGGACGA 
39541 TGAGGGTGTG GTCGGTGGTG GTGAGGTGGT GTTGCAGGGC GGTCAGGACG CGGGTGGCGC 
39601 GGGTGTGGGC GCGGGTGGGT ATGTCCTCGG GGTCGTCGGG GTGGGCGGCG GTGATCAGGA 
39661 CGTGTCCCTC GGGCAGGTCA CCGTCGTAGA CCGCCTCGGC GACCGCGAGC CACTCCAACC 
39721 GGAGCGGGTT CGGCCCCGAC GGGGTGTCGG CCCGCTCCCT CAGCACCAGC GAGTCCACCG 
39781 ACACGACAGG ACGGCCATCC GGGTCGGCCA CGCGCACGGC GACGCCGGCC TCCCCCCGGG 
39841 TGAGGGCGAC GCGCACCGCG GCGGCCCCGG TGGCGTTCAG GCGCACGCCC GTCCAGGAGA 
39901 ACGGCAGCTC GATCCCGCCG CCCGCGTCGA GGCGCCCGGC GTGCAGGGCC GCGTCGAGCA 
39961 GTGCCGGATG CACACCGAAA CCGTCCGCCT CGGCGGCCTG CTCGTCGGGC AGCGCCACCT 
40021 CGGCATACAC GGTGTCACCA TCACGCCAGG CAGCCCGCAA CCCCTGGAAC GCCGACCCGT
40081 ACTCATAACC GGCATCCCGC AGTTCGTCAT AGAACCCCGA GACGTCGACG GCCGCGGCCG
40141 TGGCCGGCGG CCACTGCGAG AACGGCTCAC CGGAAGCGTT GGAGGTATCC GGGGTGTCGG
40201 GGGTCAGGGT GCCGCTGGCG TGCCGGGTCC AGCTGCCCGT GCCCTCGGTA CGCGCGTGGA
40261 CGGTCACCGG CCGCCGTCCG GCCTCATCGG CCCCTTCCAC GGTCACCGAC ACATCCACCG 
40321 CTGCGGTCAC CGGCACCACG AGCGGGGATT CGATGACCAG TTCATCCACC ACCCCGCAAC
40381 CGGTCTCGTC ACCGGCCCGG ATGACCAGCT CCACAAACGC CGTACCCGGC AGCAGAACCG
40441 TGCCCCGCAC CGCGTGATCA GCCAGCCAGG GATGCGTACG CAATGAGATC CGGCCGGTGA 
40501 GAACAACACC ACCACCGTCG TCGGCGGGCA GTGCTGTGAC GGCGGCCAGC ATCGGATGCG 
40561 CCGCCCCGGT CAGCCCGGCC GCGGACAGGT CGGTGGCACC GGCCGCCTCC AGCCAGTACC 
40621 GCCTGTGCTC GAACGCGTAG GTGGGCAGAT CCAGCAGCCG CCCCGGCACC GGTTCGACCA 
40681 CCGTGCCCCA GTCCACCCCC GCACCCAGAG TCCACGCCTG CGCCAACGCC CCCAGCCACC
40741 GCTCCCAGCC ACCGTCACCA GTCCGCAACG ACGCCACCGT GCGGGCCTGT TCCATCGCCG 
40801 GCAGCAGCAC CGGATGGGCA CTGCACTCCA CGAACACCGA CCCGTCCAGC TCCGCCACCG 
40861 CCGCATCCAG CGCGACAGGG CGACGCAGGT TCCGGTACCA GTACCCCTCA TCCACCGGCT
40921 CGGTCACCCA GGCGCTGTCC ACGGTCGACC ACCACGCCAC CGACCCGGTC CCGCCGGAAA
40981 TTCCCTTCAG TACCTCAGCG AGTTCGTCCT CGATGGCCTC CACGTGAGGC GTGTGGGAGG 
41041 CGTAGTCGAC CGCGATACGA CGCACCCGCA CCCCATCAGC CTCATACCGC GCCACCACCT 
41101 CCTCCACCGC CGACGGGTCC CCCGCCACCA CCGTCGAAGC CGGACCATTA CGCGCCGCGA 
41161 TCCACACACC CTCGACCAGA CCCACCTCAC CGGCCGGCAA CGCCACCGAA GCCATCGCCC 
41221 CCCGGCCGGC CAGCCGCGCC GCGATCACCC GACTGCGCAA CGCCACCACG CGGGCGGCGT 
41281 CCTCCAGGCT GAGGGCTCCG GCCACACACG CCGCCGCGAT CTCCCCCTGC GAGTGTCCGA 
41341 CCACAGCGTC CGGCACGACC CCATGCGCCT GCCACAGCGC GGCCAGGCTC ACCGCGACCG 
41401 CCCAGCTGGC CGGCTGGACC ACCTCCACCC GCTCCGCCAC ATCCGACCGC GACAACATCT 
41461 CCCGCACATC CCAGCCCGTG TGCGGCAACA ACGCCCGCGC ACACTCCTCC ATACGAGCCG
41521 CGAACACCGC GGAACGGTCC ATGAGTTCCA CGCCCATGCC CACCCACTGG GCACCCTGCC 
41581 CGGGGAAGAC GAACACCGTA CGCGGCTGAT CCACCGCCAC ACCCATCACC CGGGCATCAC 
41641 CCAGCAGCAC CGCACGGTGA CCGAAGACAG CACGCTCACG CACCAACCCC TGCGCGACCG 
41701 CGGCCACATC CACCCCACCC CCGCGCAGAT ACCCCTCCAG CCGCTCCACC TGCCCCCGCA 
41761 GACTCACCTC ACCACGAGCC GACACCGGCA ACGGCACCAA CCCATCACCA CCCGACTCCA 
41821 CACGCGACGG CCCAGGAACA CCCTCCAGGA TCACGTGCGC GTTCGTACCG CTCACCCCGA 
41881 ACGACGACAC ACCCGCATGC GGTGCCCGAT CCGACTCGGG CCACGGCCTC GCCTCGGTGA 
41941 GCAGCTCCAC CGCACCGGCC GACCAGTCCA CATGCGACGA CGGCTCGTCC ACGTGCAGCG 
42001 TCTTCGGCGC GATCCCATGC CGCATCGCCA TGACCATCTT GATGACACCG GCGACACCCG 
42061 CAGCCGCCTG CGCATGACCG ATGTTCGACT TGACCGAACC GAGGTAGAGC GGCGTGTCGC 
42121 GGTCCTGCCC GTAGGCCGCG AGGACGGCCT GCGCCTCGAT CGGGTCGCCC AGCCGCGTGC 
42181 CGGTGCCGTG CGCCTCCACC ACGTCCACAT CGGCGGCGCG CAGTCCGGCG TTGACCAACG 
42241 CCTGCCGGAT CACGCGCTGC TGGGCGACGC CGTTGGGGGC GGACAGTCCG TTGGAGGCAC 
42301 CGTCCTGGTT CACCGCCGAG CCGCGGACGA CCGCGAGAAC GGTGTGCCCG TTGCGCTCGG 
42361 CGTCGGAGAG CCGCTCCAGC ACGAGAACGC CGACGCCCTC GGCGAAGCCG GTCCCGTCCG 
42421 CCGCGTCGGC GAACGCCTTG CACCGTCCGT CCGGGGAGAG TCCGCGCTGC CGGGAGAACT
42481 CCACGAGCTC TGCGGTGTTC GCCATGACGG TGACACCGCC GACCAGCGCC AGGGAGCACT
42541 CCCCGGCCCG CAGTGCCTGT GCCGCCTGGT GCAGGGCGAC CAGCGACGAC GAGCACGCCG 
42601 TGTCGACCGT GACCGCCGGG CCCTGAAGTC CGTACACGTA CGAGAGGCGC CCGGACAGGA 
42661 CGCTCGTCTG CGTCGCCGTG ACACCGAGCC CGCCCAGGTC CCGGCCGACG CCGTAGCCCT 
42721 GGTTGAACGC GCCCATGAAC ACGCCGGTGT CGCTCTCCCG GAGCCTGTCC GGCACGATGC 
42781 CGGCGTTCTC GAACGCCTCC CAGGAGGTCT CCAGGATCAG GCGCTGCTGG GGGTCCATCG 
42841 CCAGCGCCTC GTTCGGACTG ATGCCGAAGA ACGCGGCGTC GAACCCGGCG CCGGCCAGGA 
42901 ATCCGCCGTG GCGTGTCGTG GAGCGGCCGG CCGCGTCCGG GTCCGGGTCG TACAGCGCGT 
42961 CGACGTCCCA GCCCCGGTCG GTGGGGAACT CGGTGATCGC CTCGGTACCG GCGGCGACGA 
43021 GCCGCCACAG GTCCTCCGGC GAGGCGACCC CGCCGGGCAG TCGGCACGCC ATGCCGACGA 
43081 TCGCGACGGG GTCGCCGGAG CCGAGGGTCT GGGCGGTCGC GGGTGCCGCT GTCGCGGAGC
43141 CGGCGAGGTG GGCGGCGAAC GCACGCGGAG TGGGGTGGTC GAACGCGGTT GACGCGGGCA 
43201 CCCGCAGACC CGTCCGCGCG GCGACGGTGT TGGTGAACTC GACGGTGGTG AGCGAGTCGA 
43261 GGCCGTTCTC GCGGAACGTG CGGTCCGGGG AGCAGTGTCC GGCGCCCGGC AGGCCCAGGA
43321 CGGTGGCGAC GCTGTCGCGG ACCAGGTCGA GCAGTACGTC CTCCCGGCCC GCACGGGCCG
43381 CGGCGAGGCG GTTCGCCCAN TCCTGTTCCG TGGCGTCGGG CTCGGCCGGT CCGGTCAGTG
43441 CGGTGAGGAT CGGCGGCGTG GCGCCCGCCA TCGTCGCGGC CCGCGCCCCG GCGGAACCGG
43501 TCCGGGCCAC GATGTAGGAG CCGCCGCCCG CGATGGCCTT CTCGATCAGG TCGCCGGTGA
43561 GCGCCGGCCG TTCGATGNNN GGCAGCGCGC GGACGGTGAC GGTGGGGAGT CCCTCCGCGG
43621 CCCGTGGCCG GGTNTNNNNN TCGGCGCCGG CCGGGCCGTC GAGCAGGACG TGCACGAGCG
43681 CGCCGGGGTT CGCGGNTTNN TCGGCTGCGG TGGTCACGTG GGTGAGGCCG GTCTCGTCGC
43741 GGAGCAGGCC GGNNANNNTN TCGGCGTCCT CCCCGGTGAC CAGGACCGGC GCGTCCGGGC
43801 CGATCGGAGG CGGNANGNTN AGGACCATCT TGCCGGTGTG CCGGGCGTGG CTCATCCACG
43861 CGAACGCGTC CCGCGCACGG CGGATGTCCC ACGGCTGCAC CGGCAGCGGG CACAGCTCAC
43921 CGCGGTCGAA NANNTNNANN AGCAGTTCGA GGATCTCCCG CAGGCGCGCG GGATCCACGT
43981 CGGCCAGGTC GAANNNNTNN TGGGCGGCGT GGCGGATGTC GGTCTTGCCC ATCTCGACGA
44041 ACCGGCCGCC GGGTNNNNNN AGGCCGATGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT
44101 TGAGCACGAC NNNNNNNNNN GGGAAGGTGT CGGCGAACGC GGCGCTGCGG GAGTTCGCCA
44161 CATGGTCGGT GTNNAANNNN TCGGCGTGCA GCAGGTGTTG TTTGGCGGGA CTGGCGGTGG
44221 CGTACACCTC GGGGGNGAGN TGGCGGGCGA TCCGGGTCGC CGCCATGCCG ACACCGCCCG
44281 TCGCGGCGTG GANNANNANN TTCTGGCCGG GTCGCAGCTC GCCCGCGTCG ACGAGGCCGT
44341 ACCAGGCGGT GNNNAANANN ATGGGCACGG ACGCGGCGAT GGGGAACGAC CATCCCCGTG
44401 GGATCCGTGC GANNANNNNN CGGTCCGCGA CCACGCTGCG CCGGAACGCG TCCTGCACGA
44461 GACCGAACAC GCGGTNGNNN GGGGCCAGGT CGTCGACGCC GGGTCCGACT TCGGTCACGA
44521 TGCCCGCGGC NTNNNNNNNN ATCTCGCCCT CGCCCGGGTA GGTGCCGAGC GCGATCAGCA
44581 CGTCGCGGAA GTTCAGCNNN GCGGCGCGGA CGTCGATGCG GACCTCGCCG GCGGCCAGGG
44641 GCGCGGCGGG ACGTNGANNN GGGCGACGAC GAGGTCGCGG AGCGTTCCGG AGGCGGGCGG
44701 GCGCAGCGCC NNNNNNNNNN GTCGGCAGGG GGGTGGTGTC CGCGCGTACC AGCCGGGGCA
44761 CGTAGGCCAC GCCGGNNNNN AGCGCGATCT GGGGTTCGCC GAGCGAGGCC GCGGCGGGGA
44821 CGAGGTCGTC ATCGNNNTNN GTGTCCACCA GCACGAACGA TCCGGGTTCG GCGGCCTGGC
44881 GGCGCAGCGC CTCGTNNNAN AGCCGGGCCT GGTCCGCGTC CGGGATCTCG GCCGGGCCGA
44941 CGCCCACCGC GNGGNNGNTN ACGACCGTCC GGCGGGGTGA CGGGGTGCCG GGCAGGTCGC
45001 GCCGCTCCCN NNNNNNNNNN CACAGCGTGG CCTCGCCACT GCCGGTGGCG ACCAGATGGG
45061 CCGGCAGCCN NNNNNNNNNN GCGCGCTGGA CCTTGCCCGA CGCGGTGCGG GGGATCGTGG
45121 TGACGTGNNA NATNTNNTNN GGCACCTTGA AGTAGGCGAG CCGGCGGCGG CACTCGGCGA
45181 GGATCGCCTC GGNNNNNANN CGGGGGCCGT CGGAAACGAC GTAGAGCACG GGTATGTCGC
45241 CGAGGACGGG GTGNGGNNNN CCCGCCGCGG CGGCGTCCCG GACACCGGCC ACCTCCTGGG
45301 CGACGGTCTC GATNTNNNNN GGGTGGATGT TCTCCCCGCC GCGGATGATC AGCTCCTTGA
45361 CCCGGCCGGT GATNNTNANN TGTCCGGTCT CGGCCTGACG TGCGAGGTCC CCGGTGCGGT
45421 ACCAGNNGTN NNNNNNNNNN TGGGCGGTCG CCTCCGGCTG GGCGTGGTAG CCGAGCATGA
45481 GGCTCGGCCC NNNNNNNNNN AGCTCGCCCT CCTCGCCGGG TGCCACGTCG GCGCCGGACA
45541 CCGGGTNGAN NAANNNNNNN GACAGGCCCG GCACGGGCAG CCCGCACGAG CCGGGAACCC
45601 GCGCATNNTN NANNNTNTTN GCGGTGAGCG AGCCGGTCGT CTCGGTGCAG CCGTACGTGT
45661 CGAGNAGNNN NNNNNNNNNN GTCGCCTCGA AATCCCTGGT GAGCGAGGCC GGCGAGGTGG
45721 ATNNGNNNAN NNNNNNNNNN CGCAGCGCGC GAGCCCGCGG CTCGCCGGAC ACGGCGCCGA
45781 GGAGGTAGNN NTANATNNTT GGCACGCCGA CGAGCACGGT GCTGGAGTGT TCGGCCAGGG
45841 CGTCGAGGAC NNNNNNNNNN ACGAAGCCGC CCAGGATACG GGCGGACGCG CCGACCGTGA
45901 GGACGGCGAG CAGGNANANN TGGTGGCCGA GGCTGTGGAA CAGCGGGGCG GGCCAGAGCA
45961 GTTCGTCGTC CTCGGTNANN CGCCAGGACG GCACGTCGCA GTGCATCGCG GACCACAGGC
46021 CGCTGCGCTG TGCGGAAANN ACGCCCTTGG GACGGCCGGT GGTGCCGGAG GTGTAGAGCA
46081 TCCAGGCGGG TTCGTCCAGG CCGAGGTCGT CGCGGGGCGG GCACGGCGGC TCGGTCCCGG
46141 CGAGGTCCTC GTAGGAGACG CAGTCCGGTG CCCGGCGCCC GACGAGCACG ACGGTGGCGT
46201 CGGTGCCGGT GCGGCGCACC TGGTCGAGGT GGGTTTCGTC GGTGACCAGC ACGGTCGCGC
46261 CGGAGTCCGT CAGGAAGTGG GCGAGTTCGG CGTCGGCGGC GTCCGGGTTG AGCGGGACGG
46321 CGACGGCGGC GGCGCGGGCG GCGGCGAGGT AGACCTCGAT GGTCTCGATC CGGTTGCCGA
46381 GCAGCATCGC GACCCGGTCG CCGCGGTCGA CGCCGGACGC GGCGAGGTGT CCGGCGAGCC
46441 GGCCGGCCCG GAGCCGGAGT TGCGTGTACG TCACGGCGCG TTGGGAATCC GTGTAGGCGA
46501 TCCGGTCGCC GCGTCGCTCG GCATGGATGC GGAGCAATTC GTGCAACGGC CGGATTGGTT
46561 CCACACGCGC CATGGAAACA CCTTTCTCTC GACCAACCGC ACAACAGCAC GGAACCGGCC
46621 ACGAGTAGAC GCCGGCGACG CTAGCAGCGT TTTCCGGACC GCCACCCCCT GAAGATCCCC
46681 CTACCGTGGC CGGCCTCCCC GGACGCTCAT CTAGGGGGTT GCACGCATAC CGCCGTGCGT
46741 AATTGCCTTC CTGATGACCG ATGCCGGACG CCAGGGAAGG GTGGAGGCGT TGTCCATATC
46801 TGTCACGGCG CCGTATTGCC GCTTCGAGAA GACCGGATCA CCGGACCTCG AGGGTGACGA
46861 GACGGTGCTC GGCCTGATCG AGCACGGCAC CGGCCACACC GACGTGTCGC TGGTGGACGG
46921 TGCTCCCCGG ACCGCCGTGC ACACCACGAC CCGTGACGAC GAGGCGTTCA CCGAGGTCTG
46981 GCACGCACAG CGCCCTGTCG AGTCCGGCAT GGACAACGGC ATCGCCTGGG CCCGCACCGA
47041 CGCGTACCTG TTCGGTGTCG TGCGCACCGG CGAGAGCGGC AGGTACGCCG ATGCCACCGC
47101 GGCCCTCTAC ACGAACGTCT TCCAGCTCAC CCGGTCGCTG GGGTATCCCC TGCTCGCCCG
47161 GACCTGGAAC TACGTCAGCG GTATCAACAC GACGAACGCG GACGGGCTGG AGGTGTACCG
47221 GGACTTCTGC GTGGGCCGCG CCCAGGCGCT CGACGAGGGC GGGATCGACC CGGCCACCAT
47281 GCCCGCGGCC ACCGGTATCG GCGCCCACGG GGGCGGCATC ACCTGCGTGT TCCTCGCCGC
47341 CCGGGGCGGA GTGCGGATCA ACATCGAGAA CCCCGCCGTC CTCACGGCCC ACCACTACCC
47401 GACGACGTAC GGTCCGCGGC CCCCGGTCTT CGCACGGGCC ACCTGGCTGG GCCCGCCGGA
47461 GGGGGGCCGG CTGTTCATCT CCGCGACGGC CGGCATCCTC GGACACCGAA CGGTGCACCA
47521 CGGTGATGTG ACCGGCCAGT GCGAGGTCGC CCTCGACAAC ATGGCCCGGG TCATCGGCGC
47581 GGAGAACCTG CGGCGCCACG GCGTCCAGCG GGGGCACGTC CTCGCCGACG TGGACCACCT
47641 CAAGGTCTAC GTCCGCCGCC GCGAGGATCT CGATACGGTC CGCCGGGTCT GCGCCGCACG
47701 CCTGTCGAGC ACCGCGGCCG TCGCCCTTTT GCACACCGAC ATAGCCCGCG AGGATCTGCT
47761 CGTCGAAATC GAAGGCATGG TGGCGTGACA ATACCCGGTA AAAGGCCCGC GACGCTGCGC
47821 CTCGGCGGAT CCGCGAAGAG AAAGAAGAGC GTCACCGCAC AGCGCGGCAG CCCGGTCCTT
47881 TCGTCCTTCG CACAGCGGCG GATCTGGTTT CTCCAGCAAT TGGACCCGGA GAGCAACGCC
47941 TATAATCTCC CGCTCGTGCA ACGCCTGCGC GGTCTATTGG ACGCGCCGGC CCTGGAGCGT
48001 GCGCTGGCGC TCGTCGTCGC GCGCCACGAG GCGTTGCGGA CGGTGTTCGA CACCGCCGAC
48061 GGCGAGCCCC TCCAGCGGGT GCTTCCCGCC CCGGAACACC TCCTGCGCCA CGCGCGGGCG
48121 GGCAGCGAGG AGGACGCCGC CCGGCTCGTC CGCGACGAGA TCGCCGCGCC GTTCGACCTC
48181 GCCACCGGGC CGTTGATCAG GGCCCTGCTG ATCCGCCTCG GTGACGACGA CCACGTTCTC
48241 GCGGTGACCG TGCACCATGT CGCCGGCGAC GGCTGGTCGT TCGGGCTCCT CCAACATGAA
48301 CTCGCAGCCC ACTACACGGC GCTGCGCGAC ACTGCCCGCC CTGCCGAACT GCCGCCGTTG
48361 CCGGTGCAGT ACGCCGACTT CGCCGCCTGG GAGCGGCGCG AACTCACCGG CGCCGGACTG
48421 GACAGGCGTC TGGCCTACTG GCGCGAGCAA CTCCGGGGCG CCCCGGCGCG GCTCGCCCTC
48481 CCCACCGACC GTCCCCGCCC GCCGGTCGCC GACGCGGACG CGGGCATGGC CGAGTGGCGG
48541 CCGCCGGCCG CGCTGGCCAC CGCGGTCCTC ACGCTCGCGC GCGACTCCGG TGCGTCCGTG
48601 TTCATGACCC TGCTGGCGGC CTTCCAAGCG GTCCTCGCCC GGCAGGCGGG CACGCGGGAC
48661 GTGCTGGTCG GCACGCCCGT GGCGAACCGT ACGCGGGCGG CGTACGAGGG CCTGATCGGC
48721 ATGTTCGTCA ACACGCTCGC GCTGCGCGGC GACCTCTCGG GCGATCCGTC GTTCCGGGAA
48781 CTCCTCGACC GCTGCCGGGC CACGACCACG GACGCGTTCG CCCACGCCGA CCTGCCGTTC
48841 GAGAACGTCA TCGAACTCGT CGCACCGGAA CGCGACCTGT CGGTCAACCC GGTCGTCCAG
48901 GTGCTGTTGC AGGTGCTGCG GCGCGACGCG GCGACGGCCG CGCTGCCCGG CATCGCGGCC
48961 GAACCGTTCC GCACCGGACG CTGGTTCACC CGCTTCGACC TCGAATTCCA TGTGTACGAG
49021 GAGCCGGGTG GCGCGCTGAC CGGCGAACTG CTCTACAGCC GTGCGCTGTT CGACGAGCCA
49081 CGGATCACGG GGTTGCTGGA GGAGTTCACG GCGGTGCTTC AGGCGGTCAC CGCCGACCCG
49141 GACGTACGGC TGTCGCGGCT GCCGGCCGGC GACGCGACGG CGGCAGCGCC CGTGGTGCCC
49201 TCGAACGACA CGGCGCGGGA CCTGCCCGTC GACACGCTGC CGGGCCTGCT GGCCCGGTAC
49261 GCCGCACGCA CCCCCGGCGC CGTGGCCGTC ACCGACCCGC ACATCTCCCT CACCTACGCG
49321 CAGCTGGACC GGCGGGCGAA CCGCCTCGCG CACCTGCTCC GCGCGCGCGG CACCGCCACC
49381 GGCGACCTGG TCGGGATCTG CGCCGATCGC GGCGCCGACC TGATCGTCGG CATCGTGGGG
49441 ATCCTCAAGG CGGGCGCCGC TTATGTGCCG CTGGACCCCG AACATCCTCC GGAGCGCACG
49501 GCGTTCGTGC TGGCCGACGC GCAGCTGACC ACGGTGGTGG CGCACGAGGT CTACCGTTCC
49561 CGGTTCCCCG ATGTGCCGCA CGTGGTGGCG TTGGACGACC CGGAGCTGGA CCGGCAGCCG
49621 GACGACACGG CGCCGGACGT CGAGCTGGAC CGGGACAGCC TCGCCTACGC GATCTACACG
49681 TCCGGGTCGA CCGGCAGGCC GAAGGCCGTG CTCATGCCGG GTGTCAGCGC CGTCAACCTG
49741 CTGCTCTGGC AGGAGCGCAC GATGGGCCGC GAGCCGGCCA GCCGCACCGT CCAGTTCGTG
49801 ACGCCCACGT TCGACTACTC GGTGCAGGAG ATCTTTTCCG CGCTGCTGGG CGGCACGCTC
49861 GTCATCCCGC CGGACGAGGT GCGGTTCGAC CCGCCGGGAC TCGCCCGGTG GATGGACGAA
49921 CAGGCGATTA CCCGGATCTA CGCGCCGACG GCCGTACTGC GCGCGCTGAT CGAGCACGTC
49981 GATCCGCACA GCGACCAGCT CGCCGCCCTG CGGCACCTGT GCCAGGGCGG CGAGGCGCTG
50041 ATCCTCGACG CGCGGTTGCG CGAGCTGTGC CGGCACCGGC CCCACCTGCG CGTGCACAAT 
50101 CACTACGGTC CGGCCGAAAG CCAGCTCATC ACCGGGTACA CGCTGCCCGC CGACCCCGAC
50161 GCGTGGCCCG CCACCGCACC GATCGGCCCG CCGATCGACA ACACCCGCAT CCATCTGCTC 
50221 GACGAGGCGA TGCGGCCGGT TCCGGACGGT ATGCCGGGGC AGCTCTGCGT CGCCGGCGTC 
50281 GGCCTCGCCC GTGGGTACCT GGCCCGTCCC GAGCTGACCG CCGAGCGCTG GGTGCCGGGA 
50341 GATGCGGTCG GCGAGGAGCG CATGTACCTC ACCGGCGACC TGGCCCGCCG CGCGCCCGAC 
50401 GGCGACCTGG AATTCCTCGG CCGGATCGAC GACCAGGTCA AGATCCGCGG CATCCGCGTC
50461 GAACCGGGTG AGATCGAGAG CCTGCTCGCC GAGGACGCCC GCGTCACGCA GGCGGCGGTG
50521 TCCGTGCGCG AGGACCGGCG GGGCGAGAAG TTCCTGGCCG CGTACGTCGT ACCGGTGGCC 
50581 GGCCGGCACG GCGACGACTT CGCCGCGTCG CTGCGCGCGG GACTGGCCGC CCGGCTGCCC 
50641 GCCGCGCTCG TGCCCTCCGC CGTCGTCCTG GTGGAGCGAC TGCCGAGGAC CACGAGCGGC 
50701 AAGGTGGACC GGCGCGCGCT GCCCGACCCG GAGCCGGGCC CGGCGTCGAC CGGGGCGGTT
50761 ACGCCCCGCA CCGATGCCGA GCGGACGGTG TGCCGGATCT TCCAGGAGGT GCTCGACGTC
50821 CCGCGGGTCG GTGCCGACGA CGACTTCTTC ACGCTCGGCG GGCACTCCCT GCTCGCCACC 
50881 CGGGTCGTCT CCCGCATCCG CGCCGAGCTG GGTGCCGATG TCCCGCTGCG TACGCTCTTC
50941 GACGGGCGGA CGCCCGCCGC GCTCGCCCGT GCGGCGGACG AGGCCGGCCC GGCCGCCCTG
51001 CCCCCGATCG CGCCCTCCGC GGAGAACGGG CCGGCCCCCC TCACCGCGGC ACAGGAACAG
51061 ATGCTGCACT CGCACGGCTC GCTGCTCGCC GCGCCCTCCT ACACGGTCGC CCCGTACGGG
51121 TTCCGGCTGC GCGGGCCACT CGACCGCGAA GCGCTCGACG CGGCACTGAC CCGGATCGCC
51181 GCGCGCCACG AGCCGCTGCG GACCGGGTTC CGCGATCGGG AACAGGTCGT CCGGCCGCCC
51241 GCTCCGGTGC GCGCCGAGGT GGTTCCGGTG CCGGTCGGCG ACGTCGACGC CGCGGTCCGG
51301 GTCGCCCACC GGGAGCTGAC CCGGCCGTTC GACCTCGTGA ACGGGTCGTT GCTGCGTGCC
51361 GTGCTGCTGC CGCTGGGCGC CGAGGATCAC GTGCTGCTGC TGATGCTGCA CCACCTCGCC
51421 GGTGACGGAT GGTCCTTCGA CCTCCTGGTC CGGGAGTTGT CGGGGACGCA ACCGGACCTT
51481 CCGGTGTCCT ACACGGACGT GGCCCGGTGG GAACGGAGTC CGGCCGTGAT CGCGGCCAGG
51541 GAGAACGACC GGGCCTACTG GCGCCGGCGG CTGGGGGGCG CCACCGCGCC GGAGCTGCCC
51601 GCGGTCCGGC CCGGCGGGGC ACCGACCGGG CGGGCGTTCC TGTGGACGCT CAAGGACACC
51661 GCCGTCCTGG CGGCACGCCG GGTCGCGGAC GCCCACGACG CGACGTTGCA CGAAACCGTG
51721 CTCGGCGCCT TCGCCCTGGT CGTGGCGGAG ACCGCCGACA CCGACGACGT GCTCGTCGCG
51781 ACGCCGTTCG CGGACCGGGG GTACGCCGGG ACCGACCACC TCATCGGCTT CTTCGCGAAG

51841 GTCCTCGCGC TGCGCCTCGA CCTCGGCGGC ACGCCGTCGT TCCCCGAGGT GCTGCGCCGG 
51901 GTGCACACCG CGATGGTGGG CGCGCACGCC CACCAGGCGG TGCCCTACTC CGCGCTGCGC
51961 GCCGAGGACC CCGCGCTGCC GCCGGCCCCC GTGTCGTTCC AGCTCATCAG CGCGCTCAGC 
52021 GCGGAACTGC GGCTGCCCGG CATGCACACC GAGCCGTTCC CCGTCGTCGC CGAGACCGTC 
52081 GACGAGATGA CCGGCGAACT GTCGATCAAC CTCTTCGACG ACGGTCGCAC CGTCTCCGGC 
52141 GCGGTGGTCC ACGATGCCGC GCTGCTCGAC CGTGCCACCG TCGACGATTT GCTCACCCGG 
52201 GTGGAGGCGA CGCTGCGTGC CGCCGCGGGC GACCTCACCG TACGCGTCAC CGGTTACGTG
52261 GAAAGCGAGT AGCCATGCCC GAGCAGGACA AGACAGTCGA GTACCTTCGC TGGGCGACCG 
52321 CGGAACTCCA GAAGACCCGT GCGGAACTCG CCGCGCACAG CGAGCCGTTG GCGATCGTGG 
52381 GGATGGCCTG CCGGCTGCCC GGCGGGGTCG CGTCGCCGGA GGACCTGTGG CAGTTGCTGG 

52441 AGTCCGGTGG CGACGGCATC ACCGCGTTCC CCACGGACCG GGGCTGGGAG ACCACCGCCG
52501 ACGGTCGCGG CGGCTTCCTC ACCGGGGCGG CCGGCTTCGA CGCGGCGTTC TTCGGCATCA

52561 GCCCGCGCGA GGCGCTGGCG ATGGACCCGC AGCAGCGCCT GGCCCTGGAG ACCTCGTGGG
52621 AGGCGTTCGA GCACGCGGGC ATCGATCCGC AGACGCTGCG GGGCAGTGAC ACGGGGGTGT 
52681 TCCTCGGCGC GTTCTTCCAG GGGTACGGCA TCGGCGCCGA CTTCGACGGT TACGGCACCA 
52741 CGAGCATTCA CACGAGCGTG CTCTCCGGCC GCCTCGCGTA CTTCTACGGT CTGGAGGGTC

52801 CGGCGGTCAC GGTCGACACG GCGTGTTCGT CGTCGCTGGT GGCGCTGCAC CAGGCCGGGC
52861 AGTCGCTGCG CTCCGGCGAA TGCTCGCTCG CCCTGGTCGG CGGCGTCACG GTGATGGCCT 
52921 CGCCGGCGGG GTTCGCGGAC TTCTCCGAGC AGGGCGGCCT GGCCCCCGAC GCGCGCTGCA 
52981 AGGCCTTCGC GGAAGCGGCT GACGGCACCG GTTTCGCCGA GGGGTCCGGC GTCCTGATCG 
53041 TCGAGAAGCT CTCCGACGCC GAGCGCAACG GCCACCGCGT GCTGGCGGTC GTCCGGGGTT
53101 CCGCCGTCAA CCAGGACGGT GCCTCCAACG GGCTGTCCGC GCCGAACGGG CCGTCGCAGG
53161 AGCGGGTGAT CCGGCAGGCC CTGGCCAACG CCGGACTCAC CCCGGCGGAC GTGGACGCCG 
53221 TCGAGGCCCA CGGCACCGGC ACCAGGCTGG GCGACCCCAT CGAGGCACAG GCCGTGCTGG 
53281 CCACCTACGG GCAGGGGCGC GACACCCCTG TGCTGCTGGG CTCGCTGAAG TCCAACATCG 
53341 GCCACACCCA GGCCGCCGCG GGCGTCGCCG GTGTCATCAA GATGGTCCTC GCCATGCGGC
53401 ACGGCACCCT GCCCCGCACC CTGCACGTGG ACACGCCGTC CTCGCACGTC GACTGGACGG
53461 CCGGCGCCGT CGAACTCCTC ACCGACGCCC GGCCCTGGCC CGAAACCGAC CGCCCACGGC
53521 GCGCCGGTGT CTCCTCCTTC GGCGTCAGCG GCACCAACGC CCACATCATC CTCGAAAGCC 
53581 ACCCCCGACC GGCCCCCGAA CCCGCCCCGG CACCCGACAC CGGACCGCTG CCGCTGCTGC 

53641 TCTCGGCCCG CACCCCGCAG GCACTCGACG CACAGGTACA CCGCCTGCGC GCGTTCCTCG
53701 ACGACAACCC CGGCGCGGAC CGGGTCGCCG TCGCGCAGAC ACTCGCCCGG CGCACCCAGT 
53761 TCGAGCACCG CGCCGTGCTG CTCGGCGACA CGCTCATCAC CGTGAGCCCG AACGCCGGCC
53821 GCGGACCGGT GGTCTTCGTC TACTCGGGGC AAAGCACGCT GCACCCGCAC ACCGGGCGGC 
53881 AACTCGCGTC CACCTACCCC GTGTTCGCCG AAGCGTGGCG CGAGGCCCTC GACCACCTCG 

53941 ACCCCACCCA GGGCCCGGCC ACGCACTTCG CCCACCAGAC CGCGCTCACC GCGCTCCTGC
54001 GGTCCTGGGG CATCACCCCG CACGCGGTCA TCGGCCACTC CCTCGGTGAG ATCACCGCCG
54061 CGCACGCCGC CGGTGTCCTG TCCCTGAGGG ACGCGGGCGC GCTCCTCACC ACCCGCACCC 
54121 GCCTGATGGA CCAACTGCCG TCGGGCGGCG CGATGGTCAC CGTCCTGACC AGCGAGGAAA 
54181 AGGCACGCCA GGTGCTGCGG CCGGGCGTGG AGATCGCCGC CGTCAACGGC CCCCACTCCC 

54241 TCGTGCTGTC CGGGGACGAG GAAGCCGTAC TCGAAGCCGC CCGGCAGCTC GGCATCCACC
54301 ACCGCCTGCC GACCCGCCAC GCCGGCCACT CCGAGCGCAT GCAGCCACTC GTCGCCCCCC
54361 TCCTCGACGT CGCCCGGACC CTGACGTACC ACCAGCCCCA CACCGCCATC CCCGGCGACC
54421 CCACCACCGC CGAATACTGG GCGCACCAGG TCCGCGACCA AGTACGTTTC CAGGCGCACA
54481 CCGAGCAGTA CCCGGGCGCG ACGTTCCTCG AGATCGGCCC CAACCAGGAC CTCTCGCCGC

54541 TCGTCGACGG CGTTGCCGCC CAGACCGGTA CGCCCGACGA GGTGCGGGCG CTGCACACCG
54601 CGCTCGCGCA GCTCCACGTC CGCGGCGTCG CGATCGACTG GACGCTCGTC CTCGGCGGGG
54661 ACCGCGCGCC CGTCACGCTG CCCACGTATC CGTTCCAGCA CAAGGACTAC TGGCTGCGGC
54721 CCACCTCCCG GGCCGATGTG ACCGGCGCGG GGCAGGAGCA GGTGGCGCAC CCGCTGCTCG 
54781 GCGCCGCGGT CGCGCTGCCC GGCACGGGCG GAGTCGTCCT GACCGGCCGC CTGTCGCTGG 

54841 CCTCCCATCC GTGGCTCGGC GAGCACGCGG TCGACGGCAC CGTGCTCCTG CCCGGCGCGG
54901 CCTTCCTCGA ACTCGCGGCG CGCGCCGGCG ACGAGGTCGG CTGCGACCTG CTGCACGAAC
54961 TCGTCATCGA GACGCCGCTC GTGCTGCCCG CGACCGGCGG TGTGGCGGTC TCCGTCGAGA
55021 TCGCCGAACC CGACGACACG GGGCGGCGGG CGGTCACCGT CCACGCGCGG GCCGACGGCT 
55081 CGGGCCTGTG GACCCGACAC GCCGGCGGAT TCCTCGGCAC GGCACCGGCA CCGGCCACGG 

55141 CCACGGACCC GGCACCCTGG CCGCCCGCGG AAGCCGGACC GGTCGACGTC GCCGACGTCT
55201 ACGACCGGTT CGAGGACATC GGGTACTCCT ACGGACCGGG CTTCCGGGGG CTGCGGGCCG 
55261 CCTGGCGCGC CGGCGACACC GTGTACGCCG AGGTCGCGCT CCCCGACGAG CAGAGCGCCG 
55321 ACGCCGCCCG TTTCACGCTG CACCCCGCGC TGCTCGACGC CGCGTTCCAG GCCGGCGCGC 
55381 TGGCCGCGCT CGACGCACCC GGCGGGGCGG CCCGACTGCC GTTCTCGTTC CAGGACGTCC 

55441 GCATCCACGC GGCCGGGGCG ACGCGGCTGC GGGTCACGGT CGGCCGCGAC GGCGAGCGCA
55501 GCACCGTCCG CATGACCGGC CCGGACGGGC AGCTGGTGGC CGTGGTCGGT GCCGTGCTGT 
55561 CGCGCCCGTA CGCGGAAGGC TCCGGTGACG GCCTGCTGCG CCCGGTCTGG ACCGAGCTGC 
55621 CGATGCCCGT CCCGTCCGCG GACGATCCGC GCGTGGAGGT CCTCGGCGCC GACCCGGGCG 
55681 ACGGCGACGT TCCGGCGGCC ACCCGGGAGC TGACCGCCCG CGTCCTCGGC GCGCTCCAGC 

55741 GCCACCTGTC CGCCGCCGAG GACACCACCT TGGTGGTACG GACCGGCACC GGCCCGGCCG
55801 CTGCCGCCGC CGCGGGTCTG GTCCGCTCGG CGCAGGCGGA GAACCCCGGC CGCGTCGTGC
55861 TCGTCGAGGC GTCCCCGGAC ACCTCGGTGG AGCTGCTCGC CGCGTGCGCC GCGCTGGACG 

55921 AACCGCAGCT GGCCGTCCGG GACGGCGTGC TCTTCGCGCC GCGGCTGGTC CGGATGTCCG
55981 ACCCCGCGCA CGGCCCGCTG TCCCTGCCGG ACGGCGACTG GCTGCTCACC CGGTCCGCCT 

56041 CCGGCACGTT GCACGACGTC GCGCTCATAG CCGACGACAC GCCCCGGCGG GCGCTCGAAG
56101 CCGGCGAGGT CCGCATCGAC GTCCGCGCGG CCGGACTGAA CTTCCGCGAT GTGCTGATCG
56161 CGCTCGGGAC GTACACCGGG GCCACGGCCA TGGGCGGCGA GGCCGCGGGC GTCGTGGTGG
56221 AGACCGGGCC CGGCGTGGAC GACCTGTCCC CCGGCGACCG GGTGTTCGGC CTGACCCGGG 
56281 GCGGCATCGG CCCGACGGCC GTCACCGACC GGCGCTGGCT GGCCCGGATC CCCGACGGCT
56341 GGAGCTTCAC CACGGCGGCG TCCGTCCCGA TCGTGTTCGC GACCGCGTGG TACGGCCTGG
56401 TCGACCTCGG CACACTGCGC GCCGGCGAGA AGGTCCTCGT CCACGCGGCC ACCGGCGGTG
56461 TCGGCATGGC CGCCGCACAG ATCGCCCGCC ACCTGGGCGC CGAGCTCTAC GCCACCGCCA
56521 GTACCGGCAA GCAGCACGTC CTGCGCGCCG CCGGGCTGCC CGACACGCAC ATCGCCGACT
56581 CTCGGACGAC CGCGTTCCGG ACCGCTTTCC CGCGCATGGA CGTCGTCCTG AACGCGCTGA
56641 CCGGCGAGTT CATCGACGCG TCGCTCGACC TGCTGGACGC CGACGGCCGG TTCGTCGAGA
56701 TGGGCCGCAC CGAGCTGCGC GACCCGGCCG CGATCGTCCC CGCCTACCTG CCGTTCGACC
56761 TGCTGGACGC GGGCGCCGAC CGCATCGGCG AGATCCTGGG CGAACTGCTC CGGCTGTTCG
56821 ACGCGGGCGC GCTGGAGCCG CTGCCGGTCC GTGCCTGGGA CGTCCGGCAG GCACGCGACG
56881 CGCTCGGCTG GATGAGCCGC GCCCGCCACA TCGGCAAGAA CGTCCTGACG CTGCCCCGGC
56941 CGCTCGACCC GGAGGGCGCC GTCGTCCTCA CCGGCGGCTC CGGCACGCTC GCCGGCATCC
57001 TCGCCCGCCA CCTGCGCGAA CGGCATGTCT ACCTGCTGTC CCGGACGGCA CCGCCCGAGG
57061 GGACGCCCGG CGTCCACCTG CCCTGCGACG TCGGTGACCG GGACCAGCTG GCGGCGGCCC
57121 TGGAGCGGGT GGACCGGCCG ATCACCGCCG TGGTGCACCT CGCCGGTGCG CTGGACGACG
57181 GCACCGTCGC GTCGCTCACC CCCGAGCGTT TCGACACGGT GCTGCGCCCG AAGGCCGACG
57241 GCGCCTGGTA CCTGCACGAG CTGACGAAGG AGCAGGACCT CGCCGCGTTC GTGCTCTACT
57301 CGTCGGCCGC CGGCGTGCTC GGCAACGCCG GCCAGGGCAA CTACGTCGCC GCGAACGCGT
57361 TCCTCGACGC GCTCGCCGAG CTGCGCCACG GTTCCGGGCT GCCGGCCCTC TCCATCGCCT
57421 GGGGGCTCTG GGAGGACGTG AGCGGGCTCA CCGCGGCGCT CGGCGAAGCC GACCGGGACC
57481 GGATGCGGCG CAGCGGTTTC CGGGCCATCA CCGCGCAACA GGGCATGCAC CTGTACGAGG
57541 CGGCCGGCCG CACCGGAAGT CCCGTGGTGG TCGCGGCGGC GCTCGACGAC GCGCCGGACG
57601 TGCCGCTGCT GCGCGGCCTG CGGCGGACGA CCGTCCGGCG GGCCGCCGTC CGGGAGTGTT
57661 CGTCCGCCGA CCGGCTCGCC GCGCTGACCG GCGACGAGCT CGCCGAAGCG CTGCTGACGC
57721 TCGTCCGGGA GAGCACCGCC GCCGTGCTCG GCCACGTGGG TGGCGAGGAC ATCCCCGCGA
57781 CGGCGGCGTT CAAGGACCTC GGCATCGACT CGCTCACCGC GGTCCAGCTG CGCAACGCCC
57841 TCACCGAGGC GACCGGTGTG CGGCTGAACG CCACGGCGGT CTTCGACTTC CCGACCCCGC
57901 ACGTGCTCGC CGGGAAGCTC GGCGACGAAC TGACCGGCAC CCGCGCGCCC GTCGTGCCCC
57961 GGACCGCGGC CACGGCCGGT GCGCACGACG AGCCGCTGGC GATCGTGGGA ATGGCCTGCC
58021 GGCTGCCCGG CGGGGTCGCG TCACCCGAGG AGCTGTGGCA CCTCGTGGCA TCCGGCACCG
58081 ACGCCATCAC GGAGTTCCCG ACGGACCGCG GCTGGGACGT CGACGCGATC TACGACCCGG
58141 ACCCCGACGC GATCGGCAAG ACCTTCGTCC GGCACGGTGG CTTCCTCACC GGCGCGACAG
58201 GCTTCGACGC GGCGTTCTTC GGCATCAGCC CGCGCGAGGC CCTCGCGATG GACCCGCAGC
58261 AGCGGGTGCT CCTGGAGACG TCGTGGGAGG CGTTCGAAAG CGCCGGCATC ACCCCGGACT
58321 CGACCCGCGG CAGCGACACC GGCGTGTTCG TCGGCGCCTT CTCCTACGGT TACGGCACCG
58381 GTGCGGACAC CGACGGCTTC GGCGCGACCG GCTCGCAGAC CAGTGTGCTC TCCGGCCGGC
58441 TGTCGTACTT CTACGGTCTG GAGGGTCCGG CGGTCACGGT CGACACGGCG TGTTCGTCGT
58501 CGCTGGTGGC GCTGCACCAG GCCGGGCAGT CGCUGCGCTC CGGCGAATGC TCGCTCGCCC
58561 TGGTCGGCGG CGTCACGGTG ATGGCGTCTC CCGGCGGCTT CGTGGAGTTC TCCCGGCAGC
58621 GCGGCCTCGC GCCGGACGGC CGGGCGAAGG CGTTCGGCGC GGGTGCGGAC GGCACGAGCT
58681 TCGCCGAGGG TGCCGGTGTG CTGATCGTCG AGAGGCTCTC CGACGCCGAA CGCAACGGTC
58741 ACACCGTCCT GGCGGTCGTC CGTGGTTCGG CGGTCAACCA GGATGGTGCC TCCAACGGGC
58801 TGTCGGCGCC GAACGGGCCG TCGCAGGAGC GGGTGATCCG GCAGGCCCTG GCCAACGCCG
58861 GGCTCACCCC GGCGGACGTG GACGCCGTCG AGGCCCACGG CACCGGCACC AGGCTGGGCG
58921 ACCCCATCGA GGCACAGGCG GTACTGGCCA CCTACGGACA GGAGCGCGCC ACCCCCCTGC
58981 TGCTGGGCTC GCTGAAGTCC AACATCGGCC ACGCCCAGGC CGCGTCCGGC GTCGCCGGCA
59041 TCATCAAGAT GGTGCAGGCC CTCCGGCACG GGGAGCTGCC GCCGACGCTG CACGCCGACG
59101 AGCCGTCGCC GCACGTCGAC TGGACGGCCG GCGCCGTCGA ACTGCTGACG TCGGCCCGGC
59161 CGTGGCCCGA GACCGACCGG CCACGGCGTG CCGCCGTCTC CTCGTTCGGG GTGAGCGGCA
59221 CCAACGCCCA CGTCATCCTG GAGGCCGGAC CGGTAACGGA GACGCCCGCG GCATCGCCTT
59281 CCGGTGACCT TCCCCTGCTG GTGTCGGCAC GCTCACCGGA AGCGCTCGAC GAGCAGATCC
59341 GCCGACTGCG CGCCTACCTG GACACCACCC CGGACGTCGA CCGGGTGGCC GTGGCACAGA
59401 CGCTGGCCCG GCGCACACAC TTCGCCCACC GCGCCGTGCT GCTCGGTGAC ACCGTCATCA
59461 CCACACCCCC CGCGGACCGG CCCGACGAAC TCGTCTTCGT CTACTCCGGC CAGGGCACCC
59521 AGCATCCCGC GATGGGCGAG CAGCTCGCCG CCGCCCATCC CGTGTTCGCC GACGCCTGGC
59581 ATGAAGCGCT CCGCCGCCTT GACAACCCCG ACCCCCACGA CCCCACGCAC AGCCAGCATG
59641 TGCTCTTCGC CCACCAGGCG GCGTTCACCG CCCTCCTGCG GTCCTGGGGC ATCACCCCGC 
59701 ACGCGGTCAT CGGCCACTCG CTGGGCGAGA TCACCGCGGC GCACGCCGCC GGCATCCTGT 
59761 CGCTGGACGA CGCGTGCACC CTGATCACCA CGCGCGCCCG CCTCATGCAC ACGCTCCCGC 
59821 CACCCGGTGC CATGGTCACC GTACTGACCA GCGAAGAGAA GGCACGCCAG GCGTTGCGGC 
59881 CGGGCGTGGA GATCGCCGCC GTCAACGGGC CCCACTCCAT CGTGCTGTCC GGGGACGAGG 
59941 ACGCCGTGCT CACCGTCGCC GGGCAGCTCG GCATCCACCA CCGCCTGCCC GCCCCGCACG 
60001 CCGGGCACTC CGCGCACATG GAGCCCGTGG CCGCCGAGCT GCTCGCCACC ACCCGCGGGC 
60061 TCCGCTACCA CCCTCCCCAC ACCTCCATTC CGAACGACCC CACCACCGCT GAGTACTGGG 
60121 CCGAGCAGGT CCGCAAGCCC GTGCTGTTCC ACGCCCACGC GCAGCAGTAC CCGGACGCCG 
60181 TGTTCGTGGA GATCGGCCCC GCCCAGGACC TCTCCCCGCT CGTCGACGGG ATCCCGCTGC 
60241 AGAACGGCAC CGCGGACGAG GTGCACGCGC TGCACACCGC GCTCGCGCAC CTCTACGCGC 
60301 GCGGTGCCAC GCTCGACTGG CCCCGCATCC TCGGGGCTGG GTCACGGCAC GACGCGGATG 
60361 TGCCCGCGTA CGCGTTCCAA CGGCGGCACT ACTGGATCGA GTCGGCACGC CCGGCCGCAT 
60421 CCGACGCGGG CCACCCCGTG CTGGGCTCCG GTATCGCCCT CGCCGGGTCG CCGGGCCGGG
60481 TGTTCACGGG TTCCGTGCCG ACCGGTGCGG ACCGCGCGGT GTTCGTCGCC GAGCTGGCGC
60541 TGGCCGCCGC GGACGCGGTC GACTGCGCCA CGGTCGAGCG GCTCGACATC GCCTCCGTGC 
60601 CCGGCCGGCC GGGCCATGGC CGGACGACCG TACAGACCTG GGTCGACGAG CCGGCGGACG 
60661 ACGGCCGGCG CCGGTTCACC GTGCACACCC GCACCGGCGA CGCCCCGTGG ACGCTGCACG 
60721 CCGAGGGGGT GCTGCGCCCC CATGGCACGG CCCTGCCCGA TGCGGCCGAC GCCGAGTGGC 
60781 CCCCACCGGG CGCGGTGCCC GCGGACGGGC TGCCGGGTGT GTGGCGCCGG GGGGACCAGG 
60841 TCTTCGCCGA GGCCGAGGTG GACGGACCGG ACGGTTTCGT GGTGCACCCC GACCTGCTCG 
60901 ACGCGGTCTT CTCCGCGGTC GGCGACGGAA GCCGCCAGCC GGCCGGATGG CGCGACCTGA 
60961 CGGTGCACGC GTCGGACGCC ACCGTACTGC GCGCCTGCCT CACCCGGCGC ACCGACGGAG 
61021 CCATGGGATT CGCCGCCTTC GACGGCGCCG GCCTGCCGGT ACTCACCGCG GAGGCGGTGA 
61081 CGCTGCGGGA GGTGGCGTCA CCGTCCGGCT CCGAGGAGTC GGACGGCCTG CACCGGTTGG 
61141 AGTGGCTCGC GGTCGCCGAG GCGGTCTACG ACGGTGACCT GCCCGAGGGA CATGTCCTGA 
61201 TCACCGCCGC CCACCCCGAC GACCCCGAGG ACATACCCAC CCGCGCCCAC ACCCGCGCCA 
61261 CCCGCGTCCT GACCGCCCTG CAACACCACC TCACCACCAC CGACCACACC CTCATCGTCC 
61321 ACACCACCAC CGACCCCGCC GGCGCCACCG TCACCGGCCT CACCCGCACC GCCCAGAACG 
61381 AACACCCCCA CCGCATCCGC CTCATCGAAA CCGACCACCC CCACACCCCC CTCCCCCTGG 
61441 CCCAACTCGC CACCCTCGAC CACCCCCACC TCCGCCTCAC CCACCACACC CTCCACCACC
61501 CCCACCTCAC CCCCCTCCAC ACCACCACCC CACCCACCAC CACCCCCCTC AACCCCGAAC 
61561 ACGCCATCAT CATCACCGGC GGCTCCGGCA CCCTCGCCGG CATCCTCGCC CGCCACCTGA 
61621 ACCACCCCCA CACCTACCTC CTCTCCCGCA CCCCACCCCC CGACGCCACC CCCGGCACCC 
61681 ACCTCCCCTG CGACGTCGGC GACCCCCACC AACTCGCCAC CACCCTCACC CACATCCCCC 
61741 AACCCCTCAC CGCCATCTTC CACACCGCCG CCACCCTCGA CGACGGCATC CTCCACGCCC 
61801 TCACCCCCGA CCGCCTCACC ACCGTCCTCC ACCCCAAAGC CAACGCCGCC TGGCACCTGC 
61861 ACCACCTCAC CCAAAACCAA CCCCTCACCC ACTTCGTCCT CTACTCCAGC GCCGCCGCCG 
61921 TCCTCGGCAG CCCCGGACAA GGAAACTACG CCGCCGCCAA CGCCTTCCTC GACGCCCTCG 
61981 CCACCCACCG CCACACCCTC GGCCAACCCG CCACCTCCAT CGCCTGGGGC ATGTGGCACA 
62041 CCACCAGCAC CCTCACCGGA CAACTCGACG ACGCCGACCG GGACCGCATC CGCCGCGGCG 
62101 GTTTCCTCCC GATCACGGAC GACGAGGGCA TGCGCCTCTA CGAGGCGGCC GTCGGCTCCG 
62161 GCGAGGACTT CGTCATGGCC GCCGCGATGG ACCCGGCACA GCCGATGACC GGCTCCGTAC 
62221 CGCCCATCCT GAGCGGCCTG CGCAGGAGCG CGCGGCGCGT CGCCCGTGCC GGGCAGACGT 
62281 TCGCCCAGCG GCTCGCCGAG CTGCCCGACG CCGACCGCGG CGCGGCGCTG ACCACCCTCG 
62341 TCTCGGACGC CACGGCCGCC GTGCTCGGCC ACGCCGACGC CTCCGAGATC GCGCCGACCA 
62401 CGACGTTCAA GGACCTCGGC ATCGACTCGC TCACCGCGAT CGAGCTGCGC AACCGGCTCG
62461 CGGAGGCGAC CGGGCTGCGG CTGAGTGCCA CGCTGGTGTT CGACCACCCG ACACCTCGGG
62521 TCCTCGCCGC CAAGCTCCGC ACCGATCTGT TCGGCACGGC CGTGCCCACG CCCGCGCGGA 
62581 CGGCACGGAC CCACCACGAC GAGCCACTCG CGATCGTCGG CATGGCGTGC CGACTGCCCG 
62641 GCGGGGTCGC CTCGCCGGAG GACCTGTGGC AGCTCGTGGC GTCCGGCACC GACGCGATCA
62701 CCGAGTTCCC CACCGACCGC GGCTGGGACA TCGACCGGCT GTTCGACCCG GACCCGGACG
62761 CCCCCGGCAA GACCTACGTC CGGCACGGCG GCTTCCTCGC CGAGGCCGCC GGCTTCGATG
62821 CCGCGTTCTT CGGCATCAGC CCGCGCGAGG CACGGGCCAT GGACCCGCAG CAGCGCGTCA
62881 TCCTCGAAAC CTCCTGGGAG GCGTTCGAGA ACGCGGGCAT CGTGCCGGAC ACGCTGCGCG 
62941 GCAGCGACAC CGGCGTGTTC ATGGGCGCGT TCTCCCATGG GTACGGCGCC GGCGTCGACC 
63001 TGGGCGGGTT CGGCGCCACC GCCACGCAGA ACAGCGTGCT CTCCGGCCGG TTGTCGTACT 
63061 TCTTCGGCAT GGAGGGCCCG GCCGTCACCG TCGACACCGC CTGCTCGTCG TCGCTGGTCG 
63121 CCCTGCACCA GGCGGCACAG GCGCTGCGGA CTGGAGAATG CTCGCTGGCG CTCGCCGGCG 
63181 GTGTCACGGT GATGCCCACC CCGCTGGGCT ACGTCGAGTT CTGCCGCCAG CGGGGACTCG 
63241 CCCCCGACGG CCGTTGCCAG GCCTTCGCGG AAGGCGCCGA CGGCACGAGC TTCTCGGAGG 
63301 GCGCCGGCGT TCTTGTGCTG GAGCGGCTCT CCGACGCCGA GCGCAACGGA CACACCGTCC 
63361 TCGCGGTCGT CCGCTCCTCC GCCGTCAACC AGGACGGCGC CTCCAACGGC ATCTCCGCAC 
63421 CCAACGGCCC CTCCCAGCAG CGCGTCATCC GCCAGGCCCT CGACAAGGCC GGGCTCGCCC 
63481 CCGCCGACGT GGACGTGGTG GAGGCCCACG GCACCGGAAC CCCGCTGGGC GACCCGATCG
63541 AGGCACAGGC CATCATCGCG ACCTACGGCC AGGACCGCGA CACACCGCTC TACCTCGGTT 
63601 CGGTCAAGTC GAACATCGGA CACACCCAGA CCACCGCCGG TGTCGCCGGC GTCATCAAGA 
63661 TGGTCATGGC GATGCGCCAC GGCATCGCGC CGAAGACACT GCACGTGGAC GAGCCGTCGT 
63721 CGCATGTGGA CTGGACCGAG GGTGCGGTGG AACTGCTCAC CGAGGCGAGG CCGTGGCCCG 
63781 ACGCGGGACG CCCGCGCCGC GCGGGCGTGT CGTCGCTCGG TATCAGCGGT ACGAACGCCC 
63841 ACGTGATCCT TGAGGGTGTT CCCGGGCCGT CGCGTGTGGA GCCGTCTGTT GACGGGTTGG
63901 TGCCGTTGCC GGTGTCGGCT CGGAGTGAGG CGAGTCTGCG GGGGCAGGTG GAGCGGCTGG 
63961 AGGGGTATCT GCGCGGGAGT GTGGATGTGG CCGCGGTCGC GCAGGGGTTG GTGCGTGAGC 
64021 GTGCTGTCTT CGGTCACCGT GCGGTACTGC TGGGTGATGC CCGGGTGATG GGTGTGGCGG
64081 TGGATCAGCC GCGTACGGTG TTCGTCTTTC CCGGGCAGGG TGCTCAGTGG GTGGGCATGG
64141 GTGTGGAGTT GATGGACCGT TCTGCGGTGT TCGCGGCTCG TATGGAGGAG TGTGCGCGGG 
64201 CGTTGTTGCC GCACACGGGC TGGGATGTGC GGGAGATGTT GGCGCGGCCG GATGTGGCGG
64261 AGCGGGTGGA GGTGGTCCAG CCGGCCAGCT GGGCGGTCGC GGTCAGCCTG GCCGCACTGT
64321 GGCAGGCCCA CGGGGTCGTA CCCGACGCGG TGATCGGACA CTCCCAGGGC GAGATCGCGG
64381 CGGCGTGCGT GGCCGGGGCC CTCAGCCTTG AGGACGCCGC CCGCGTGGTG GCCTTGCGCA
64441 GCCAGGTCAT CGCGGCGCGA CTGGCCGGGC GGGGAGCGAT GGCTTCGGTG GCATTGCCGG
64501 CCGGTGAGGT CGGTCTGGTC GAGGGCGTGT GGATCGCGGC GCGTAACGGC CCCGCCTCGA 
64561 CAGTCGTGGC CGGCGAGCCG TCGGCGGTGG AGGACGTGGT GACGCGGTAT GAGACCGAAG 
64621 GCGTGCGAGT GCGTCGTATC GCCGTCGACT ACGCCTCCCA CACGCCCCAC GTGGAAGCCA
64681 TCGAGGACGA ACTCGCTGAG GTACTGAAGG GAGTTGCAGG GAAGGCCGCG TCGGTGGCGT
64741 GGTGGTCGAC CGTGGACAGC GCCTGGGTGA CCGAGCCGGT GGATGAGAGT TACTGGTACC 
64801 GGAACCTGCG TCGCCCCGTC GCGCTGGACG CGGCGGTGGC GGAGCTGGAC GGGTCCGTGT 
64861 TCGTGGAGTG CAGCGCCCAT CCGGTGCTGC TGCCGGCGAT GGAACAGGCC CACACGGTGG
64921 CGTCGTTGCG CACCGGTGAC GGCGGCTGGG AGCGATGGCT GACGGCGTTG GCGCAGGCGT 
64981 GGACCCTGGG CGCGGCAGTG GACTGGGACA CGGTGGTCGA ACCGGTGCCA GGGCGGCTGC
65041 TCGATCTGCC CACCTACGCG TTCGAGCGCC GGCGCTACTG GCTGGAAGCG GCCGGTGCCA 
65101 CCGACCTGTC CGCGGCCGGG CTGACAGGGG CAGCACATCC CATGCTGGCC GCCATCACGG 
65161 CACTACCCGC CGACGACGGT GGTGTTGTTC TCACCGGCCG GATCTCGTTG CGCACGCATC 
65221 CCTGGCTGGC TGATCACGCG GTGCGGGGCA CGGTCCTGCT GCCGGGCACG GCCTTTGTGG 
65281 AGCTGGTCAT CCGGGCCGGT GAGGAGACCG GTTGCGGGAT AGTGGATGAA CTGGTCATCG
65341 AATCCCCCCT CGTGGTGCCG GCGACCGCAG CCGTGGATCT GTCGGTGACC GTGGAAGGAG 
65401 CTGACGAGGC CGGACGGCGG CGAGTGACCG TCCACGCCCG CACCGAAGGC ACCGGCAGCT
65461 GGACCCGGCA CGCCAGCGGC ACCCTGACCC CCGACACCCC CGACACCCCC AACGCTTCCG 
65521 GTGTTGTCGG TGCGGAGCCG TTCTCGCAGT GGCCACCTGC CACTGCCGCG GCCGTCGACA 
65581 CCTCGGAGTT CTACTTGCGC CTGGACGCGC TGGGCTACCG GTTCGGACCC ATGTTCCGCG 
65641 GAATGCGGGC TGCCTGGCGT GATGGTGACA CCGTGTACGC CGAGGTCGCG CTCCCCGAGG 
65701 ACCGTGCCGC CGACGCGGAC GGTTTCGGCA TGCACCCGGC GCTGCTCGAC GCGGCCTTGC 
65761 AGAGCGGCAG CCTGCTCATG CTGGAATCGG ACGGCGAGCA GAGCGTGCAA CTGCCGTTCT
65821 CCTGGCACGG CGTCCGGTTC CACGCGACGG GCGCGACCAT GCTGCGGGTG GCGGTCGTAC 
65881 CGGGCCCGGA CGGCCTCCGG CTGCATGCCG CGGACAGCGG GAACCGTCCC GTCGCGACGA 
65941 TCGACGCGCT CGTGACCCGG TCCCCGGAAG CGGACCTCGC GCCCGCCGAT CCGATGCTGC 
66001 GGGTCGGGTG GGCCCCGGTG CCCGTACCTG CCGGGGCCGG TCCGTCCGAC GCGGACGTGC
66061 TGACGCTGCG CGGCGACGAC GCCGACCCGC TCGGGGAGAC CCGGGACCTG ACCACCCGTG
66121 TTCTCGACGC GCTGCTCCGG GCCGACCGGC CGGTGATCTT CCAGGTGACC GGTGGCCTCG 
66181 CCGCCAAGGC GGCCGCAGGC CTGGTCCGCA CCGCTCAGAA CGAGCAGCCC GGCCGCTTCT 
66241 TCCTCGTCGA AACGGACCCG GGAGAGGTCC TGGACGGCGC GAAGCGCGAC GCGATCGCGG 
66301 CACTCGGCGA GCCCCATGTG CGGCTGCGCG ACGGCCTCTT CGAGGCAGCC CGGCTGATGC
66361 GGGCCACGCC GTCCCTGACG CTCCCGGACA CCGGGTCGTG GCAGCTGCGG CCGTCCGCCA 
66421 CCGGTTCCCT CGACGACCTT GCCGTCGTCC CCACCGACGC CCCGGACCGG CCGCTCGCGG 
66481 CCGGCGAGGT GCGGATCGCG GTACGCGCGG CGGGCCTGAA CTTCCGGGAT GTCACGGTCG
66541 CGCTCGGTGT GGTCGCCGAT GCGCGTCCGC TCGGCAGCGA GGCCGCGGGT GTCGTCCTGG
66601 AGACCGGCCC CGGTGTGCAC GACCTGGCGC CCGGCGACCG GGTCCTGGGG ATGCTCGCGG
66661 GCGCCTTCGG ACCGGTCGCG ATCACCGACC GGCGGCTGCT CGGCCGGATG CCGGACGGCT 
66721 GGACGTTCCC GCAGGCGGCG TCCGTGATGA CCGCGTTCGC GACCGCGTGG TACGGCCTGG 
66781 TCGACCTGGC CGGGCTGCGC CCCGGCGAGA AGGTCCTGAT CCACGCGGCG GCGACCGGTG 
66841 TCGGCGCGGC GGCCGTCCAG ATCGCGCGGC ATCTGGGCGC GGAGGTGTAC GCGACCACCA 
66901 GCGCCGCGAA GCGCCATCTG GTGGACCTGG ACGGAGCGCA TCTGGCCGAT TCCCGCAGCA
66961 CCGCGTTCGC CGACGCGTTC CCGCCGGTCG ATGTCGTGCT CAACTCGCTC ACCGGTGAAT 
67021 TCCTCGACGC GTCCGTCGGC CTGCTCGCGG CGGGTGGCCG GTTCATCGAG ATGGGGAAGA 
67081 CGGACATCCG GCACGCCGTC CAGCAGCCGT TCGACCTGAT GGACGCCGGC CCCGACCGGA 
67141 TGCAGCGGAT CATCGTCGAG CTGCTCGGCC TGTTCGCGCG CGACGTGCTG CACCCGCTGC
67201 CGGTCCACGC CTGGGACGTG CGGCAGGCGC GGGAGGCGTT CGGCTGGATG AGCAGCGGGC
67261 GTCACACCGG CAAGCTGGTG CTGACGGTCC CGCGGCCGCT GGATCCCGAG GGGGCCGTCG
67321 TCATCACCGG CGGCTCCGGC ACCCTCGCCG GCATCCTCGC CCGCCACCTG GGCCACCCCC
67381 ACACCTACCT GCTCTCCCGC ACCCCACCCC CCGACACCAC CCCCGGCACC CACCTCCCCT
67441 GCGACGTCGG CGACCCCCAC CAACTCGCCA CCACCCTCGC CCGCATCCCC CAACCCCTCA
67501 CCGCCGTCTT CCACACCGCC GGAACCCTCG ACGACGCCCT GCTCGACAAC CTCACCCCCG
67561 ACCGCGTCGA CACCGTCCTC AAACCCAAGG CCGACGCCGC CTGGCACCTG CACCGGCTCA
67621 CCCGCGACAC CGACCTCGCC GCGTTCGTCG TCTACTCCGC GGTCGCCGGC CTCATGGGCA
67681 GCCCGGGGCA GGGCAACTAC GTCGCGGCGA ACGCGTTCCT CGACGCGCTC GCCGAACACC
67741 GCCGTGCGCA AGGGCTGCCC GCGCAGTCCC TCGCATGGGG CATGTGGGCG GACGTCAGCG
67801 CGCTCACCGC GAAACTCACC GACGCGGACC GCCAGCGCAT CCGGCGCAGC GGATTCCCGC
67861 CGTTGAGCGC CGCGGACGGC ATGCGGCTGT TCGACGCGGC GACGCGTACC CCGGAACCGG
67921 TCGTCGTCGC GACGACCGTC GACCTCACCC AGCTCGACGG CGCCGTCGCG CCGTTGCTCC
67981 GCGGTCTGGC CGCGCACCGG GCCGGGCCGG CGCGCACGGT CGCCCGCAAC GCCGGCGAAG
68041 AGCCCCTGGC CGTGCGTCTT GCCGGGCGTA CCGCCGCCGA GCAGCGGCGC ATCATGCAGG
68101 AGGTCGTGCT CCGCCACGCG GCCGCGGTCC TCGCGTACGG GCTGGGCGAC CGCGTGGCGG
68161 CGGACCGTCC GTTCCGCGAG CTCGGTTTCG ATTCGCTGAC CGCGGTCGAC CTGCGCAATC 
68221 GGCTCGCGGC CGAGACGGGG CTGCGGCTGC CGACGACGCT GGTGTTCAGC CACCCGACGG 
68281 CGGAGGCGCT CACCGCCCAC CTGCTCGACC TGATCGACGC TCCCACCGCC CGGATCGCCG 
68341 GGGAGTCCCT GCCCGCGGTG ACGGCCGCTC CCGTGGCGGC CGCGCGGGAC CAGGACGAGC 
68401 CGATCGCCAT CGTGGCGATG GCGTGCCGGC TGCCCGGTGG TGTGACGTCG CCCGAGGACC
68461 TGTGGCGGCT CGTCGAGTCC GGCACCGACG CGATCACCAC GCCTCCTGAC GACCGCGGCT
68521 GGGACGTCGA CGCGCTGTAC GACGCGGACC CGGACGCGGC CGGCAAGGCG TACAACCTGC 
68581 GGGGCGGTTA CCTGGCCGGG GCGGCGGAGT TCGACGCGGC GTTCTTCGAC ATCAGTCCGC 
68641 GCGAAGCGCT CGGCATGGAC CCGCAGCAAC GCCTGCTGCT CGAAACGGCG TGGGAGGCGA
68701 TCGAGCGCGG CCGGATCAGT CCGGCGTCGC TCCGCGGCCG GGAGGTCGGC GTCTATGTCG
68761 GTGCGGCCGC GCAGGGCTAC GGGCTGGGCG CCGAGGACAC CGAGGGCCAC GCGATCACCG
68821 GTGGTTCCAC GAGCCTGCTG TCCGGACGGC TGGCGTACGT GCTCGGGCTG GAGGGCCCGG 
68881 CGGTCACCGT GGACACGGCG TGCTCGTCGT CTCTGGTCGC GCTGCATCTG GCGTGCCAGG 
68941 GGCTGCGCCT GGGCGAGTGC GAACTCGCTC TGGCCGGAGG GGTCTCCGTA CTGAGTTCGC 
69001 CGGCCGCGTT CGTGGAGTTC TCCCGCCAGC GCGGGCTCGC GGCCGACGGG CGCTGCAAGT
69061 CGTTCGGCGC GGGCGCGGAC GGCACGACGT GGTCCGAGGG CGTGGGCGTG CTCGTACTGG 
69121 AACGGCTCTC CGACGCCGAG CGGCTCGGGC ACACCGTGCT CGCCGTCGTC CGCGGCAGCG 
69181 CCGTCACGTC CGACGGCGCC TCCAACGGCC TCACCGCGCC GAACGGGCTC TCGCAGCAGC 
69241 GGGTCATCCG GAAGGCGCTC GCCGCGGCCG GGCTGACCGG CGCCGACGTG GACGTCGTCG
69301 AGGGGCACGG CACCGGCACC CGGCTCGGCG ACCCGGTCGA GGCGGACGCG CTGCTCGCGA
69361 CGTACGGGCA GGACCGTCCG GCACCGGTCT GGCTGGGCTC GCTGAAGTCG AACATCGGAC
69421 ATGCCACGGC CGCGGCCGGT GTCGCGGGCG TCATCAAGAT GGTGCAGGCG ATCGGCGCGG
69481 GCACGATGCC GCGGACGCTG CATGTGGAGG AGCCCTCGCC CGCCGTCGAC TGGAGCACCG
69541 GACAGGTGTC CCTGCTCGGC TCCAACCGGC CCTGGCCGGA CGACGAGCGT CCGCGCCGGG
69601 CGGCCGTCTC CGCGTTCGGG CTCAGCGGGA CGAACGCGCA CGTCATCCTG GAACAGCACC
69661 GTCCGGCGCC CGTGGCGTCC CAGCCGCCCC GGCCGCCCCG TGAGGAGTCC CAGCCGCTGC
69721 CGTGGGTGCT CTCCGCGCGG ACTCCGGCCG CGCTGCGGGC CCAGGCGGCC CGGCTGCGCG
69781 ACCACCTCGC GGCGGCACCG GACGCGGATC CGTTGGACAT CGGGTACGCG CTGGCCACCA
69841 GCCGCGCCCA GTTCGCCCAC CGTGCCGCGG TCGTCGCCAC CACCCCGGAC GGATTCCGTG
69901 CCGCGCTCGA CGGCCTCGCG GACGGCGCGG AGGCGCCCGG AGTCGTCACC GGGACCGCTC
69961 AGGAGCGGCG CGTCGCCTTC CTCTTCGACG GCCAGGGCGC CCAGCGCGCC GGAATGGGGC
70021 GCGAGCTCCA CCGCCGGTTC CCCGTCTTCG CCGCCGCGTG GGACGAGGTC TCCGACGCGT
70081 TCGGCAAGCA CCTCAAGCAC TCCCCCACGG ACGTCTACCA CGGCGAACAC GGCGCTCTCG
70141 CCCATGACAC CCTGTACGCC CAGGCCGGCC TGTTCACGCT CGAAGTGGCG CTGCTGCGGC
70201 TGCTGGAGCA CTGGGGGGTG CGGCCGGACG TGCTCGTCGG GCACTCCGTC GGCGAGGTGA
70261 CCGCGGCGTA CGCGGCGGGG GTGCTCACCC TGGCGGACGC GACGGAGTTG ATCGTGGCCC
70321 GGGGGCGGGC GCTGCGGGCG CTGCCGCCCG GGGCGATGCT CGCCGTCGAC GGAAGCCCGG
70381 CGGAGGTCGG CGCCCGCACG GATCTGGACA TCGCCGCGGT CAACGGCCCG TCCGCCGTGG
70441 TGCTCGCCGG TTCGCCGGAC GATGTGGCGG CGTTCGAACG GGAGTGGTCG GCGGCCGGGC
70501 GGCGCACGAA ACGGCTCGAC GTCGGGCACG CGTTCCACTC CCGGCACGTC GACGGTGCGC
70561 TCGACGGCTT CCGTACGGTG CTGGAGTCGC TCGCGTTCGG CGCGGCGCGG CTGCCGGTGG
70621 TGTCCACGAC GACGGGCCGG GACGCCGCGG ACGACCTCAT AACGCCCGCG CACTGGCTGC
70681 GCCATGCGCG TCGGCCGGTG CTGTTCTCGG ATGCCGTCCG GGAGCTGGCC GACCGCGGCG
70741 TCACCACGTT CGTGGCCGTC GGCCCCTCCG GCTCCCTGGC GTCGGCCGCG GCGGAGAGCG
70801 CCGGGGAGGA CGCCGGGACC TACCACGCGG TGCTGCGCGC CCGGACCGGT GAGGAGACCG
70861 CGGCGCTGAC CGCCCTCGCC GAGCTGCACG CCCACGGCGT CCCGGTCGAC CTGGCCGCGG
70921 TACTGGCCGG TGGCCGGCCA GTGGACCTTC CCGTGTACGC GTTCCAGCAC CGTTCCTACT
70981 GGCTGGCCCC GGCCGTGGCG GGGGCGCCGG CCACCGTGGC GGACACCGGG GGTCCGGCGG
71041 AGTCCGAGCC GGAGGACCTC ACCGTCGCCG AGATCGTCCG TCGGCGCACC GCGGCGCTGC
71101 TCGGCGTCAC GGACCCCGCC GACGTCGATG CGGAAGCGAC GTTCTTCGCG CTCGGTTTCG
71161 ACTCACTGGC GGTGCAGCGG CTGCGCAACC AGCTCGCCTC GGCAACCGGG CTGGACCTGC
71221 CGGCGGCCGT CCTGTTCGAC CACGACACCC CGGCCGCGCT CACCGCGTTC CTCCAGGACC
71281 GGATCGAGGC CGGCCAGGAC CGGATCGAGG CCGGCGAGGA CGACGACGCG CCCACCGTGC
71341 TCTCGCTCCT GGAGGAGATG GAGTCGCTCG ACGCCGCGGA CATCGCGGCG ACGCCGGCCC
71401 CGGAGCGTGC GGCCATCGCC GATCTGCTCG ACAAGCTCGC CCATACCTGG AAGGACTACC
71461 GATGAGCACC GATACGCACG AGGGAACGCC GCCCGCCGGC CGCTGCCCAT TCGCGATCCA
71521 GGACGGTCAC CGCGCCATCC TGGAGAGCGG CACGGTGGGT TCGTTCGACC TGTTCGGCGT
71581 CAAGCACTGG CTGGTCGCCG CCGCCGAGGA CGTCAAGCTG GTCACCAACG ATCCGCGGTT
71641 CAGCTCGGCC GCGCCGTCCG AGATGCTGCC CGACCGGCGG CCCGGCTGGT TCTCCGGGAT
71701 GGACTCACCG GAGCACAACC GCTACCGGCA GAAGATCGCG GGGGACTTCA CACTGCGCGC
71761 GGCGCGCAAG CGGGAGGACT TCGTCGCCGA GGCCGCCGAC GCCTGCCTGG ACGACATCGA
71821 GGCCGCGGGA CCCGGCACCG ACCTCATCCC CGGGTACGCC AAGCGGCTGC CCTCCCTCGT
71881 CATCAACGCG CTGTACGGGC TCACCCCTGA GGAGGGGGCC GTGCTGGAGG CACGGATGCG
71941 CGACATCACC GGCTCGGCCG ATCTGGACAG CGTCAAGACG CTGACCGACG ACTTCTTCGG
72001 GCACGCGCTG CGGCTGGTCC GCGCGAAGCG TGACGAGCGG GGCGAGGACC TGCTGCACCG
72061 GCTGGCCTCG GCCGACGACG GCGAGATCTC GCTCAGCGAC GACGAGGCGA CGGGCGTGTT
72121 CGCGACGCTG CTGTTCGCCG GCCACGACTC GGTGCAGCAG ATGGTCGGCT ACTGCCTCTA
72181 CGCACTGCTC AGCCACCCCG AGCAGCAGGC GGCGCTGCGC GCGCGCCCGG AGCTGGTCGA
72241 CAACGCGGTC GAGGAGATGC TCCGTTTCCT GCCCGTCAAC CAGATGGGCG TACCGCGCGT
72301 CTGTGTCGAG GACGTCGATG TGCGGGGCGT GCGCATCCGT GCGGGCGACA ACGTGATCCC
72361 GCTCTACTCG ACGGCCAACC GCGACCCCGA GGTGTTCCCG CAGCCCGACA CCTTCGATGT
72421 GACGCGCCCG CTGGAGGGCA ACTTCGCGTT CGGCCACGGC ATTCACAAGT GTCCCGGCCA
72481 GCACATCGCC CGGGTGCTCA TCAAGGTCGC CTGCCTGCGG TTGTTCGAGC GTTTCCCGGA
72541 CGTCCGGCTG GCCGGCGACG TGCCGATGAA CGAGGGGCTC GGGCTGTTCA GCCCGGCCGA
72601 GCTGCGGGTC ACCTGGGGGG CGGCATGAGT CACCCGGTGG AGACGTTGCG GTTGCCGAAC
72661 GGGACGACGG TCGCGCACAT CAACGCGGGC GAGGCGCAGT TCCTCTACCG GGAGATCTTC
72721 ACCCAGCGCT GCTACCTGCG CCACGGTGTC GACCTGCGCC CGGGGGACGT GGTGTTCGAC
72781 GTCGGCGCGA ACATCGGCAT GTTCACGCTT TTCGCGCATC TGGAGTGTCC TGGTGTGACC
72841 GTGCACGCCT TCGAGCCCGC GCCCGTGCCG TTCGCGGCGC TGCGGGCGAA CGTGACGCGG
72901 CACGGCATCC CGGGCCAGGC GGACCAGTGC GCGGTCTCCG ACAGCTCCGG CACCCGGAAG
72961 ATGACCTTCT ATCCCGACGC CACGCTGATG TCCGGTTTCC ACGCGGATGC CGCGGCCCGG
73021 ACGGAGCTGT TGCGCACGCT CGGCCTCAAC GGCGGCTACA CCGCCGAGGA CGTCGACACC
73081 ATGCTCGCGC AACTGCCCGA CGTCAGCGAG GAGATCGAAA CCCCTGTGGT CCGGCTCTCC
73141 GACGTCATCG CGGAGCGCGG TATCGAGGCC ATCGGCCTGC TGAAGGTCGA CGTGGAGAAG
73201 AGCGAACGGC AGGTCTTCGC CGGCCTCGAG GACACCGACT GGCCCCGTAT CCGCCAGGTC
73261 GTCGCGGAGG TCCACGACAT CGACGGCGCG CTCGAGGAGG TCGTCACGCT GCTCCGCGGC
73321 CATGGCTTCA CCGTGGTCGC CGAGCAGGAA CCGCTGTTCG CCGGCACGGG CATCCACCAG
73381 GTCGCCGCGC GGCGGGTGGC CGGCTGAGCG CCGTCGGGGC CGCGGCCGTC CGCACCGGCG
73441 GCCGCGGTGC GGACGGCGGC TCAGCCGGCG TCGGACAGTT CCTTGGGCAG TTGCTGACGG
73501 CCCTTCACCC CCAGCTTGCG GAACACGTTG GTGAGGTGCT GTTCCACCGT GCTGGAGGTG
73561 ACGAACAGCT GGCTGGCGAT CTCCTTGTTG GTGCGCCCGA CCGCGGCGTG CGACGCCACC
73621 CGCCGCTCCG CCTCGGTCAG CGATGTGATC CGCTGCGCCG GCGTCACGTC CTGGGTGCCG
73681 TCCGCGTCCG AGGACTCCCC ACCGAGCCGC CGGAGGAGCG GCACGGCTCC GCACTGGGTC
73741 GCGAGGTGCC GTGCGCGGCG GAACAGTCCC CGCGCACGGC TGTGCCGCCG GAGCATGCCG
73801 CACGCTTCGC CCATGTCGGC GAGGACGCGG GCCAGCTCGT ACTGGTCGCG GCACATGATG
73861 AGCAGATCGG CGGCCTCGTC GAGCAGTTCG ATCCGCTTGG CCGGCGGACT GTAGGCCGCC
73921 TGCACCCGCA GCGTCATCAC CCGCGCCCGG GACCCCATCG GCCGGGACAG CTGCTCGGAG
73981 ATGAGCCTCA GCCCCTCGTC ACGGCCGCGG CCGAGCAGCA GAAGCGCTTC GGCGGCGTCG
74041 ACCCGCCACA GGGCCAGGCC CGGCACGTCG ACGGACCAGC GTCGCATCCG CTCCCCGCAG
74101 TCCCGGAACG CGTTGTACGC CGCCCGGTAC CGCCCGGCCG CGAGATGGTG TTGCCCACGG
74161 GCCCAGACCA TGTGCAGTCC GAAGAGGCTG TCGGAGGTCT CCTCCGGCAA CGGCTCGGCG
74221 AGCCACCGCT CCGCCCGGTC CAGGTCGCCC AGTCGGATCG CGGCGGCCAC GGTGCTGCTC
74281 AGCGGCAATG CGGCGGCCAT CCCCCAGGAG GGCACGACCC GGGGGGCGAG CGCGGCCTCG
74341 CCGCATTCGA CGGCGGCGGT CAGGTCGCCG CGGCGCAGCG CGGCCTCGGC GCGGAACCCC
74401 GCGTGGACCG CCTCGTCGGC CGGGGTCCGC ATGTTGTCGT CACCGGCCAG CTTGTCGACC
74461 CAGGACTGGA CGGCATCGGT GTCCTCGGCG TAGAGCAGGG CCAGCAACGC CATCATGGTC
74521 GTGGTCCGGT CCGTCGTGAC CCGGGAGTGC TGGAGCACGT ACTCGGCTTT GGCCTCGGCC
74581 TGTTCGGACC AGCCGCGCAG CGCGTTGCTC AGGGCCTTGT CGGCGACGGC GCGGTGCCGG
74641 ACGGCTCCGG AAAACGAGGC GACCTCGTCC TCGGCCGGCG GATCGGCCGG ACGCGGCGGA
74701 TCGGCCGCGC CGGGATAGAT CAGCGCGAGG GACAGGTCCG CGACGCGCAG GTGCGCCCGG
74761 CCCTGCTCGC TCGGGGCGGC GGAGCGCTGG GCCGCCAGGA CCTCGGCGGC CTCGCCCGGC
74821 CGCCCGTCCA TCGCCAGCCA GCAGGCGAGC GACACGGCGT GCTCGCTGGA GAGGAGCCGT
74881 TCCCGCGACG CGGTGAGCAG CTCGGGCACA TGCCGGCCGG ATCTGGCGGG ATCGCAGAGC
74941 CGCTCGATGG CGGCGGTGTC GACGCGCAGT GCGGCGTGGA CGGCGGGGTC GTCGGAGGCC
75001 CGGTAGGCGA ACTCCAGGTA GGTGACGGCC TCGTCGAGCT CGCCGCGCAG GTGGTGCTCG
75061 CGCGCGGCGT CGGTGAACAG CCCGGCGACC TCGGCGCCGT GCACCCGGCC GGTACCCATC
75121 TGGTGGCGGG CGAGCACCTT GCTGGCCACG CCGCGGTCCC GCAGCAGTTC CAGCGCCAGC
75181 TCGTGCAGGC CACGCCGCTC GGCGGCGGAG AGGTCGTCGA GTACGACGGA GCGGGCCGCG
75241 GGGTGCGGGA ACCGCCCTTC CCGCAGCAGC CGCCCCTCGA CCAGCTGTTC GTGGGCCTGC
75301 TCGACCGCCT CGGTGTCGAG GCCGGTCATC CGCTGGACGA GGGTGAGTTC GACACTCTCG
75361 CCGAGCACGG CGGAAGCTCG GGCGACGCTC AGCGCGGCCG GGCCGCAACG ATAGAGCGAC
75421 CCGAGGTAGG CGAGCCGGTA CGCCCGCCCC GCGACCACTT CCAGGCACCC TGAGGTCCGT
75481 GTCCGTGCCT CCCGGATGTC GTCGATCAGG CCGTGGCCGA GGAGCAGGTT GCCGCCGGTC
75541 GCCCGGAACG CCTGGGCCAC CACGTCGTCG TGCGCGTCCT GGCCGAGGTG CCGGCGCACG
75601 AGTTCGGTGG TCTGCGCCTC GGTGAGCGGG CGCAGCGCGA TCTCCTGGTA GTGGCGCAGA
75661 CTCAGCAGTG CCGCCCGGAA TTGGGAGTGG GCGGGCGTCG GCCGGAGCAG CTCGGTCAGC
75721 ACGATGGCGA CACGGGCCCG GCTGATGCGG CGCGCGAGGT GGAGCAGGCA GCGCAGCGAC
75781 GGCGCGTCGG CGTGGTGCAC GTCGTCGATG CCGATCAGTA CGGGCCGCTC CGCGGCGAGC
75841 GTCAGCACCG TGCGGGTGAG TTCGGTCCCC AGGCGGTTGT CGACGTCGGC CGGCAGGTTT
75901 TCGCACGATG CCGTCAGCCG GACCAGCTCC GGTGTCCGGG CGGCCAGCTC GGGCTGGTCG
75961 AGGAGCTGGC CGAGCATGCC GTACGGCAGG GCCCGCTCCT CCATGGAGCA CACCGCGCGA
76021 AGGGTGACGA AGCCGGCCTT GGCCGCGGCG GCGTCGAGGA GTTCGGTCTT GCCGCAGGCG
76081 ATCGGCCCGG TGACGGCGGC GACGACGCCC CGCCCGCCCC CCGCTCGGGT GAGCGCCCGG
76141 TGGAGGGAAC CGAACTCGTC ATCGCGGGCG ATCAGGTCTG GGGGAGATAA GCGCGCTATC
76201 ACGAATGGAA CTACCTCGCG ACCGTCGTGG AAACCCATAG GCATCACATG GCTTGTTGAT
76261 CTGTACGGCT GTGATTCAGC CTGGCGGGAT GCTGTGCTAC AGATGGGAAG ATGTGATCTA
76321 GGGCCGTGCC GTTCCCTCAG GAGCCGACCG CCCCCGGCGC CACCCGCCGT ACCCCCTGGG
76381 CCACCAGCTC GGCGACCCGC TCCTGGTGGT CGACGAGGTA GAAGTGCCCG CCGGGGAAGA
76441 CCTCCACCGT GGTCGGCGCG GTCGTGTGCC CGGCCCAGGC GTGGGCCTGC TCCACCGTCG
76501 TCTTCGGATC GTCGTCACCG ATGCACACCG TGATCGGCGT CTCCAGCGGC GGCGCGGGCT
76561 CCCACCGGTA CGTCTCCGCC GCGTAGTAGT CCGCCCGCAA CGGCGCCAGG ATCAGCGCGC
76621 GCATTTCGTC GTCCGCCATC ACATCGGCGC TCGTCCCGCC GAGGCCGATG ACCGCCGCCA
76681 GCAGCTCGTC GTCGGACGCG AGGTGGTCCT GGTCGGCGCG CGGCTGCGAC GGCGCCCGCC
76741 GGCCCGAGAC GATCAGGTGC GCCACCGGGA GCCGCTGGGC CAGCTCGAAC GCGAGTGTCG
76801 CGCCCATGCT GTGGCCGAAC AGCACCAGCG GACGGTCCAG CCCCGGCTTC AACGCCTCGG
76861 CCACGAGGCC GGCGAGAACA CGCAGGTCGC GCACCGCCTC CTCGTCGCGG CGGTCCTGGC
76921 GGCCGGGGTA CTGCACGGCG TACACGTCCG CCACCGGGGC GAGCGCACGG GCCAGCGGAA
76981 GGTAGAACGT CGCCGATCCG CCGGCGTGGG GCAGCAGCAC CACCCGTACC GGGGCCTCGG
77041 GCGTGGGGAA GAACTGCCGC AGCCAGAGTT CCGAGCTCAC CGCACCCCCT CGGCCGCGAC
77101 CTGGGGAGCC CGGAACCGGG TGATCTCGGC CAAGTGCTTC TCCCGCATCT CCGGGTCGGT
77161 CACGCCCCAT CCCTCCTCCG GCGCCAGACA GAGGACGCCG ACTTTGCCGT TGTGCACATT
77221 GCGATGCACA TCGCGCACCG CCGACCCGAC GTCGTCGAGC GGGTAGGTCA CCGACAGCGT
77281 CGGGTGCACC ATCCCCTTGC AGATCAGGCG GTTCGCCTCC CACGCCTCAC GATAGTTCGC
77341 GAAGTGGGTA CCGATGATCC GCTTCACGGA CATCCACAGG TACCGATTGT CAAAGGCGTG
77401 CTCGTATCCC GAGGTTGACG CGCAGGTGAC GATCGTGCCA CCCCGACGTG TCACGTAGAC
77461 ACTCGCGCCG AACGTCGCGC GCCCCGGGTG CTCGAACACG ATGTCGGGAT CGTCACCGCC
77521 GGTCAGCTCC CGGATC



Those of skill in the art will recognize that, due to the degenerate nature of the 
genetic code, a variety of DNA compounds differing in their nucleotide sequences can be 
used to encode a given amino acid sequence of the invention. The native DNA sequence 
encoding the FK-520 PKS of Streptomyces hygroscopicus is shown herein merely to 
illustrate a preferred embodiment of the invention, and the present invention includes 
DNA compounds of any sequence that encode the amino acid sequences of the 
polypeptides and proteins of the invention. In similar fashion, a polypeptide can typically 
tolerate one or more amino acid substitutions, deletions, and insertions in its amino acid 
sequence without loss or significant loss of a desired activity. The present invention 
includes such polypeptides with alternate amino acid sequences, and the amino acid 
sequences shown merely illustrate preferred embodiments of the invention. 
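
The degeneracy point made above can be illustrated with a short sketch. The Python below builds the standard genetic code table and translates two DNA sequences that differ at several codon positions yet encode the same peptide. The two sequences and the names `CODON_TABLE` and `translate` are illustrative assumptions only; they are not asserted to correspond to any reading frame of the sequence shown herein.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical example sequences (illustration only): they differ at six
# of seven codons but specify the same amino acid sequence.
seq_a = "ATGAGTCACCCGGTGGAGACG"
seq_b = "ATGTCCCATCCCGTCGAAACC"

print(translate(seq_a))
print(translate(seq_b))
```

Both calls yield the same peptide, which is the sense in which many different DNA compounds can encode one amino acid sequence.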

The recombinant nucleic acids, proteins, and peptides of the invention are many
and diverse. To facilitate an understanding of the invention and the diverse compounds 
and methods provided thereby, the following general description of the FK-520 PKS 
genes and modules of the PKS proteins encoded thereby is provided. This general 
description is followed by a more detailed description of the various domains and 
modules of the FK-520 PKS contained in and encoded by the compounds of the 
invention. In this description, reference to a heterologous PKS refers to any PKS other 
than the FK-520 PKS. Unless otherwise indicated, reference to a PKS includes reference 
to a portion of a PKS. Moreover, reference to a domain, module, or PKS includes 
reference to the nucleic acids encoding the same and vice-versa, because the methods and 
reagents of the invention provide or enable one to prepare proteins and the nucleic acids 
that encode them. 

The FK-520 PKS is composed of three proteins encoded by three genes
designated fkbA, fkbB, and fkbC. The fkbA ORF encodes extender modules 7-10 of the
PKS. The fkbB ORF encodes the loading module (the CoA ligase) and extender modules
1-4 of the PKS. The fkbC ORF encodes extender modules 5-6 of the PKS. The fkbP
ORF encodes the NRPS that attaches the pipecolic acid and cyclizes the FK-520
polyketide.
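
The gene-to-module assignments just described can be summarized as a small lookup table. The sketch below is only a restatement of the preceding paragraph in Python; the names `FK520_PKS_GENES`, `ACCESSORY_GENES`, and `gene_for_module` are hypothetical and introduced for illustration.

```python
# Gene-to-module assignments as stated in the paragraph above.
FK520_PKS_GENES = {
    "fkbB": ["loading"] + [f"extender {n}" for n in range(1, 5)],
    "fkbC": [f"extender {n}" for n in range(5, 7)],
    "fkbA": [f"extender {n}" for n in range(7, 11)],
}

# fkbP encodes the NRPS rather than a PKS module, so it is listed separately.
ACCESSORY_GENES = {
    "fkbP": "NRPS (attaches pipecolic acid and cyclizes the polyketide)",
}


def gene_for_module(module: str) -> str:
    """Return the gene whose ORF encodes the named PKS module."""
    for gene, modules in FK520_PKS_GENES.items():
        if module in modules:
            return gene
    raise KeyError(module)


for name in ("loading", "extender 3", "extender 6", "extender 9"):
    print(name, "->", gene_for_module(name))
```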

The loading module of the FK-520 PKS includes a CoA ligase, an ER domain, 
and an ACP domain. The starter building block or unit for FK-520 is believed to be a 
dihydroxycyclohexene carboxylic acid, which is derived from shikimate. The 
recombinant DNA compounds of the invention that encode the loading module of the 
FK-520 PKS and the corresponding polypeptides encoded thereby are useful for a variety 
of methods and in a variety of compounds. In one embodiment, a DNA compound 
comprising a sequence that encodes the FK-520 loading module is inserted into a DNA 
compound that comprises the coding sequence for a heterologous PKS. The resulting 
construct, in which the coding sequence for the loading module of the heterologous PKS 
is replaced by the coding sequence for the FK-520 loading module, provides a novel PKS 
coding sequence. Examples of heterologous PKS coding sequences include the 
rapamycin, FK-506, rifamycin, and avermectin PKS coding sequences. In another
embodiment, a DNA compound comprising a sequence that encodes the FK-520 loading
module is inserted into a DNA compound that comprises the coding sequence for the
FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the loading module coding sequence is 
utilized in conjunction with a heterologous coding sequence. In this embodiment, the 
invention provides, for example, either replacing the CoA ligase with a different CoA 
ligase, deleting the ER, or replacing the ER with a different ER. In addition, or 
alternatively, the ACP can be replaced by another ACP. In similar fashion, the 
corresponding domains in another loading or extender module can be replaced by one or 
more domains of the FK-520 PKS. The resulting heterologous loading module coding 
sequence can be utilized in conjunction with a coding sequence for a PKS that 
synthesizes FK-520, an FK-520 derivative, or another polyketide. 

The first extender module of the FK-520 PKS includes a KS domain, an AT 
domain specific for methylmalonyl CoA, a DH domain, a KR domain, and an ACP 
domain. The recombinant DNA compounds of the invention that encode the first 
extender module of the FK-520 PKS and the corresponding polypeptides encoded 
thereby are useful for a variety of applications. In one embodiment, a DNA compound 
comprising a sequence that encodes the FK-520 first extender module is inserted into a 
DNA compound that comprises the coding sequence for a heterologous PKS. The 
resulting construct, in which the coding sequence for a module of the heterologous PKS 
is either replaced by that for the first extender module of the FK-520 PKS or the latter is 
merely added to coding sequences for modules of the heterologous PKS, provides a novel 
PKS coding sequence. In another embodiment, a DNA compound comprising a sequence 
that encodes the first extender module of the FK-520 PKS is inserted into a DNA 
compound that comprises the remainder of the coding sequence for the FK-520 PKS or a 
recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, all or only a portion of the first extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 
2-hydroxymalonyl CoA specific AT; deleting either the DH or KR or both; replacing the 
DH or KR or both with another DH or KR; and/or inserting an ER. In replacing or 
inserting KR, DH, and ER domains, it is often beneficial to replace the existing KR, DH, 
and ER domains with the complete set of domains desired from another module. Thus, if 
one desires to insert an ER domain, one may simply replace the existing KR and DH 
domains with a KR, DH, and ER set of domains from a module containing such domains. 
In addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a gene for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous first extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the first 
extender module of the FK-520 PKS. 
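
The domain replacements described in this paragraph can be pictured as operations on a simple record of a module's domains. The Python sketch below is a minimal illustration under stated assumptions: the `Module` class, its field names, and the donor module are hypothetical, and the order in which domains are listed is not meant to reflect their physical arrangement in the polypeptide.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Module:
    """Hypothetical record of an extender module's domain content."""
    name: str
    at_specificity: str          # extender unit accepted by the AT domain
    reductive: Tuple[str, ...]   # any of "DH", "ER", "KR" present in the module

    def domains(self) -> Tuple[str, ...]:
        return ("KS", f"AT[{self.at_specificity}]", *self.reductive, "ACP")


def swap_at(module: Module, new_specificity: str) -> Module:
    """Replace the AT domain with one of a different specificity."""
    return replace(module, at_specificity=new_specificity)


def swap_reductive_set(module: Module, donor: Module) -> Module:
    """Replace the existing KR/DH/ER domains with the donor's complete set."""
    return replace(module, reductive=donor.reductive)


# First extender module of the FK-520 PKS as described in the text.
fk520_m1 = Module("FK-520 extender 1", "methylmalonyl", ("DH", "KR"))

# Hypothetical donor module that carries a full DH/ER/KR set.
donor = Module("donor module (hypothetical)", "malonyl", ("DH", "ER", "KR"))

# To "insert" an ER, take the whole reductive set from the donor module.
hybrid = swap_reductive_set(fk520_m1, donor)
```

Copying the donor's whole reductive set, rather than adding a lone ER, mirrors the preference stated in the paragraph above for replacing the existing KR and DH domains with a complete KR, DH, and ER set.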

In an illustrative embodiment of this aspect of the invention, the invention 
provides recombinant PKSs and recombinant DNA compounds and vectors that encode 
such PKSs in which the KS domain of the first extender module has been inactivated. 
Such constructs are especially useful when placed in translational reading frame with the 
remaining modules and domains of an FK-520 or FK-520 derivative PKS. The utility of 
these constructs is that host cells expressing, or cell free extracts containing, the PKS 
encoded thereby can be fed or supplied with N-acylcysteamine thioesters of novel 
precursor molecules to prepare FK-520 derivatives. See U.S. patent application Serial 
No. 60/117,384, filed 27 Jan. 1999, and PCT patent publication Nos. US97/02358 and 
US99/03986, each of which is incorporated herein by reference. 

The second extender module of the FK-520 PKS includes a KS, an AT specific 
for methylmalonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA 
compounds of the invention that encode the second extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 second extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the second 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the second extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the second extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous second extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes FK- 
520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the second extender module of the FK-520 PKS. 

The third extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the third extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
third extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the third extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the third extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the third extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous third extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes FK- 
520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the third extender module of the FK-520 PKS. 

The fourth extender module of the FK-520 PKS includes a KS, an AT that binds 
ethylmalonyl CoA, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the fourth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
fourth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the fourth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the fourth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the 
remainder of the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative. 

In another embodiment, a portion of the fourth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the ethylmalonyl 
CoA specific AT with a malonyl CoA, methylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting the inactive DH; and/or inserting a KR, a KR and an active DH, or a 
KR, an active DH, and an ER. In addition, the KS and/or ACP can be replaced with 
another KS and/or ACP. In each of these replacements or insertions, the heterologous KS, 
AT, DH, KR, ER, or ACP coding sequence can originate from a coding sequence for 
another module of the FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous fourth extender module coding sequence 
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, 
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the fourth extender module of the FK-520 PKS. 

As illustrative examples, the present invention provides recombinant genes, 
vectors, and host cells that result from the conversion of the FK-506 PKS to an FK-520 
PKS and vice-versa. In one embodiment, the invention provides a recombinant set of FK- 
506 PKS genes but in which the coding sequences for the fourth extender module or at 
least those for the AT domain in the fourth extender module have been replaced by those 
for the AT domain of the fourth extender module of the FK-520 PKS. This recombinant 
PKS can be used to produce FK-520 in recombinant host cells. In another embodiment, 
the invention provides a recombinant set of FK-520 PKS genes but in which the coding 
sequences for the fourth extender module or at least those for the AT domain in the fourth 
extender module have been replaced by those for the AT domain of the fourth extender 
module of the FK-506 PKS. This recombinant PKS can be used to produce FK-506 in 
recombinant host cells. 

Other examples of hybrid PKS enzymes of the invention include those in which 
the AT domain of module 4 has been replaced with a malonyl specific AT domain to 
provide a PKS that produces 21-desethyl-FK520 or with a methylmalonyl specific AT 
domain to provide a PKS that produces 21-desethyl-21-methyl-FK520. Another hybrid 
PKS of the invention is prepared by replacing the AT and inactive KR domain of FK-520 
extender module 4 with a methylmalonyl specific AT and an active KR domain, such as, 
for example, from module 2 of the DEBS or oleandolide PKS enzymes, to produce 21- 
desethyl-21-methyl-22-desoxo-22-hydroxy-FK520. The compounds produced by these 
hybrid PKS enzymes are neurotrophins. 

The fifth extender module of the FK-520 PKS includes a KS, an AT that binds 
methylmalonyl CoA, a DH, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the fifth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 fifth 
extender module is inserted into a DNA compound that comprises the coding sequence 
for a heterologous PKS. The resulting construct, in which the coding sequence for a 
module of the heterologous PKS is either replaced by that for the fifth extender module 
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of 
the heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a DNA compound 
comprising a sequence that encodes the fifth extender module of the FK-520 PKS is 
inserted into a DNA compound that comprises the coding sequence for the FK-520 PKS 
or a recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, a portion of the fifth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one or both of the DH and KR; replacing any one or both of the 
DH and KR with another DH and/or KR; and/or inserting an ER. In addition, the KS 
and/or ACP can be replaced with another KS and/or ACP. In each of these replacements 
or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous fifth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the fifth 
extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH domain of the fifth 
extender module have been deleted or mutated to render the DH non-functional. In one 
such mutated gene, the KR and DH coding sequences are replaced with those encoding 
only a KR domain from another PKS gene. The resulting PKS genes code for the 
expression of an FK-520 PKS that produces an FK-520 analog that lacks the C-19 to 
C-20 double bond of FK-520 and has a C-20 hydroxyl group. Such analogs are preferred 
neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant fifth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this fifth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (lacking the C-19 to C-20 double 
bond of FK-506 and having a C-20 hydroxyl group) FK-506 derivative. In another 
embodiment, the present invention provides a recombinant FK-506 PKS in which the DH 
domain of module 5 has been deleted or otherwise rendered inactive and thus produces 
this novel polyketide. 

The sixth extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the sixth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 sixth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the sixth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the sixth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the sixth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing 
any one, two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous sixth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the sixth 
extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH and ER domains of the 
sixth extender module have been deleted or mutated to render them non-functional. In 
one such mutated gene, the KR, ER, and DH coding sequences are replaced with those 
encoding only a KR domain from another PKS gene. This can also be accomplished by 
simply replacing the coding sequences for extender module six with those for an extender 
module having a methylmalonyl specific AT and only a KR domain from a heterologous 
PKS gene, such as, for example, the coding sequences for extender module two encoded 
by the eryAI gene. The resulting PKS genes code for the expression of an FK-520 PKS 
that produces an FK-520 analog that has a C-18 hydroxyl group. Such analogs are 
preferred neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant sixth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this sixth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (having a C-18 hydroxyl group) 
FK-506 derivative. In another embodiment, the present invention provides a recombinant 
FK-506 PKS in which the DH and ER domains of module 6 have been deleted or 
otherwise rendered inactive and thus produces this novel polyketide. 

The seventh extender module of the FK-520 PKS includes a KS, an AT specific 
for 2-hydroxymalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the seventh extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 seventh extender module is inserted into a DNA compound that comprises 
the coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the seventh 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the seventh extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion or all of the seventh extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 2- 
hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
malonyl CoA specific AT; deleting the KR, the DH, and/or the ER; and/or replacing the 
KR, DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS 
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, 
KR, ER, or ACP coding sequence can originate from a coding sequence for another 
module of the FK-520 PKS, from a coding sequence for a PKS that produces a polyketide 
other than FK-520, or from chemical synthesis. The resulting heterologous seventh 
extender module coding sequence can be utilized in conjunction with a coding sequence 
for a PKS that synthesizes FK-520, an FK-520 derivative, or another polyketide. In 
similar fashion, the corresponding domains in a module of a heterologous PKS can be 
replaced by one or more domains of the seventh extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the seventh 
extender module have been replaced with those encoding an AT domain for malonyl, 
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes 
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-15 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant seventh extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that 
contains both this seventh extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of FK- 
506 but that express this recombinant PKS and so synthesize the corresponding (C-15- 
desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a 
recombinant FK-506 PKS in which the AT domain of module 7 has been replaced and 
thus produces this novel polyketide. 

In another illustrative embodiment, the present invention provides a hybrid PKS 
in which the AT and KR domains of module 7 of the FK-520 PKS are replaced by a 
methylmalonyl specific AT domain and an inactive KR domain, such as, for example, the 
AT and KR domains of extender module 6 of the rapamycin PKS. The resulting hybrid 
PKS produces 15-desmethoxy-15-methyl-16-oxo-FK-520, a neurotrophin compound. 

The eighth extender module of the FK-520 PKS includes a KS, an AT specific for 
2-hydroxymalonyl CoA, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the eighth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
eighth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the eighth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the eighth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the eighth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the 
2-hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
malonyl CoA specific AT; deleting or replacing the KR; and/or inserting a DH or a DH 
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. 
In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous eighth extender module coding sequence 
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, 
or another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the eighth extender module 
of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the eighth 
extender module have been replaced with those encoding an AT domain for malonyl, 
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes 
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-13 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant eighth extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that 
contains both this eighth extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of FK- 
506 but that express this recombinant PKS and so synthesize the corresponding (C-13- 
desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a 
recombinant FK-506 PKS in which the AT domain of module 8 has been replaced and 
thus produces this novel polyketide. 
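
The relationship stated above between the specificity of the replacement AT domain in extender module 7 or 8 and the group found at C-15 or C-13 of the resulting analog can be written out as a lookup. The Python sketch below is illustrative only; the table and function names are hypothetical, and the output string is merely a descriptive label, not a systematic compound name.

```python
# Group left at the affected carbon for each replacement AT specificity,
# as listed in the text ("hydrogen, methyl, or ethyl ..., respectively").
SUBSTITUENT_FOR_AT = {
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
}

# Carbon that loses its methoxy group when the AT of the module is replaced.
POSITION_FOR_MODULE = {7: "C-15", 8: "C-13"}


def describe_at_swap(module: int, new_at: str) -> str:
    """Describe the analog expected from an AT replacement in module 7 or 8."""
    position = POSITION_FOR_MODULE[module]
    group = SUBSTITUENT_FOR_AT[new_at]
    return f"{position}-desmethoxy analog with {group} at {position}"
```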

The ninth extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the ninth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 ninth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the ninth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the ninth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the ninth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing 
any one, two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous ninth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or 
another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the ninth extender module 
of the FK-520 PKS. 

The tenth extender module of the FK-520 PKS includes a KS, an AT specific for
malonyl CoA, and an ACP. The recombinant DNA compounds of the invention that 
encode the tenth extender module of the FK-520 PKS and the corresponding polypeptides 
encoded thereby are useful for a variety of applications. In one embodiment, a DNA 
compound comprising a sequence that encodes the FK-520 tenth extender module is 
inserted into a DNA compound that comprises the coding sequence for a heterologous 
PKS. The resulting construct, in which the coding sequence for a module of the 
heterologous PKS is either replaced by that for the tenth extender module of the FK-520 
PKS or the latter is merely added to coding sequences for the modules of the 
heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a 
DNA compound comprising a sequence that encodes the tenth extender module of the 
FK-520 PKS is inserted into a DNA compound that comprises the coding sequence for 
the remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an FK- 
520 derivative. 

In another embodiment, a portion or all of the tenth extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, replacing the
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; and/or inserting a KR, a KR and DH, or a KR, DH,
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP.
In each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or 
ACP coding sequence can originate from a coding sequence for another module of the 
FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than 
FK-520, or from chemical synthesis. The resulting heterologous tenth extender module 
coding sequence can be utilized in conjunction with a coding sequence for a PKS that 
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the 
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the tenth extender module of the FK-520 PKS. 
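
By way of illustration only, the domain replacements and insertions described above for the tenth extender module can be modeled as operations on a simple record. The following Python sketch is not part of the disclosed constructs; the class name, field names, function names, and substrate labels are hypothetical and were chosen only for this example.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple

# Reductive-domain sets that the text lists as insertable into module 10.
INSERTABLE = {("KR",), ("KR", "DH"), ("KR", "DH", "ER")}


@dataclass(frozen=True)
class ExtenderModule:
    """Hypothetical record for one extender module (KS and ACP are implied)."""
    name: str
    at_substrate: str                  # extender unit selected by the AT domain
    reductive: Tuple[str, ...] = ()    # optional KR/DH/ER domains, in order


def replace_at(module: ExtenderModule, substrate: str) -> ExtenderModule:
    """Return a hybrid module whose AT domain selects a different extender unit."""
    return replace(module, at_substrate=substrate)


def insert_reductive(module: ExtenderModule, domains: Iterable[str]) -> ExtenderModule:
    """Return a hybrid module carrying an inserted KR, KR/DH, or KR/DH/ER segment."""
    domains = tuple(domains)
    if domains not in INSERTABLE:
        raise ValueError(f"unsupported reductive domain set: {domains}")
    return replace(module, reductive=domains)


# Module 10 as described in the text: KS, malonyl CoA specific AT, and ACP.
module_10 = ExtenderModule("FK-520 module 10", "malonyl-CoA")
hybrid = insert_reductive(replace_at(module_10, "methylmalonyl-CoA"), ["KR", "DH"])
print(hybrid)
```

In this sketch the record for the original module is left unchanged, which reflects that a hybrid module coding sequence is a new construct derived from the naturally occurring sequence.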

The FK-520 polyketide precursor produced by the action of the tenth extender 
module of the PKS is then attached to pipecolic acid and cyclized to form FK-520. The 
enzyme FkbP is the NRPS-like enzyme that catalyzes these reactions. FkbP also includes
a thioesterase activity that cleaves the nascent FK-520 polyketide from the NRPS. The
present invention provides recombinant DNA compounds that encode the fkbP gene and
so provides recombinant methods for expressing the fkbP gene product in recombinant
host cells. The recombinant fkbP genes of the invention include those in which the coding
sequence for the adenylation domain has been mutated or replaced with coding sequences
from other NRPS-like enzymes so that the resulting recombinant FkbP incorporates a
moiety other than pipecolic acid. For the construction of host cells that do not naturally
produce pipecolic acid, the present invention provides recombinant DNA compounds that
express the enzymes that catalyze at least some of the biosynthesis of pipecolic acid (see
Nielsen et al., 1991, Biochem. 30: 5789-96). The fkbL gene encodes a homolog of RapL,
a lysine cyclodeaminase responsible in part for producing the pipecolate unit added to the
end of the polyketide chain. The fkbB and fkbL recombinant genes of the invention can be
used in heterologous hosts to produce compounds such as FK-520 or, in conjunction with
other PKS or NRPS genes, to produce known or novel polyketides and non-ribosomal
peptides.

The present invention also provides recombinant DNA compounds that encode 
the P450 oxidase and methyltransferase genes involved in the biosynthesis of FK-520. 
Figure 2 shows the various sites on the FK-520 polyketide core structure at which these 
enzymes act. By providing these genes in recombinant form, the present invention 
provides recombinant host cells that can produce FK-520. This is accomplished by 
introducing the recombinant PKS, P450 oxidase, and methyltransferase genes into a 
heterologous host cell. In a preferred embodiment, the heterologous host cell is 
Streptomyces coelicolor CH999 or Streptomyces lividans K4-114, as described in U.S.
Patent No. 5,830,750 and U.S. patent application Serial Nos. 08/828,898, filed 31 Mar. 
1997, and 09/181,833, filed 28 Oct. 1998, each of which is incorporated herein by 
reference. In addition, by providing recombinant host cells that express only a subset of 
these genes, the present invention provides methods for making FK-520 precursor 
compounds not readily obtainable by other means. 

In a related aspect, the present invention provides recombinant DNA compounds
and vectors that are useful in generating, by homologous recombination, recombinant 
host cells that produce FK-520 precursor compounds. In this aspect of the invention, a 
native host cell that produces FK-520 is transformed with a vector (such as an SCP2* 
derived vector for Streptomyces host cells) that encodes one or more disrupted genes (i.e., 
a hydroxylase, a methyltransferase, or both) or merely flanking regions from those genes. 
When the vector integrates by homologous recombination, the native, functional gene is 
deleted or replaced by the non-functional recombinant gene, and the resulting host cell 
thus produces an FK-520 precursor. Such host cells can also be complemented by 
introduction of a modified form of the deleted or mutated non-functional gene to produce 
a novel compound. 

In one important embodiment, the present invention provides a hybrid PKS and 
the corresponding recombinant DNA compounds that encode those hybrid PKS enzymes. 
For purposes of the present invention a hybrid PKS is a recombinant PKS that comprises 
all or part of one or more modules and thioesterase/cyclase domain of a first PKS and all 
or part of one or more modules, loading module, and thioesterase/cyclase domain of a 
second PKS. In one preferred embodiment, the first PKS is all or part of the FK-520 
PKS, and the second PKS is only a portion or all of a non-FK-520 PKS. 

One example of the preferred embodiment is an FK-520 PKS in which the AT 
domain of module 8, which specifies a hydroxymalonyl CoA and from which the C-13
methoxy group of FK-520 is derived, is replaced by an AT domain that specifies a
malonyl, methylmalonyl, or ethylmalonyl CoA. Examples of such replacement AT
domains include the AT domains from modules 3, 12, and 13 of the rapamycin PKS and
from modules 1 and 2 of the erythromycin PKS. Such replacements, conducted at the 
level of the gene for the PKS, are illustrated in the examples below. Another illustrative 
example of such a hybrid PKS includes an FK-520 PKS in which the natural loading 
module has been replaced with a loading module of another PKS. Another example of 
such a hybrid PKS is an FK-520 PKS in which the AT domain of module three is 
replaced with an AT domain that binds methylmalonyl CoA. 

In another preferred embodiment, the first PKS is most but not all of a non-FK-520
PKS, and the second PKS is only a portion or all of the FK-520 PKS. An illustrative
example of such a hybrid PKS includes an erythromycin PKS in which an AT specific for
methylmalonyl CoA is replaced with an AT from the FK-520 PKS specific for malonyl
CoA.

Those of skill in the art will recognize that all or part of either the first or second 
PKS in a hybrid PKS of the invention need not be isolated from a naturally occurring 
source. For example, only a small portion of an AT domain determines its specificity. See 
U.S. provisional patent application Serial No. 60/091,526, incorporated herein by 
reference. The state of the art in DNA synthesis allows the artisan to construct de novo 
DNA compounds of size sufficient to construct a useful portion of a PKS module or 
domain. For purposes of the present invention, such synthetic DNA compounds are 
deemed to be a portion of a PKS. 

Thus, the hybrid modules of the invention are incorporated into a PKS to provide 
a hybrid PKS of the invention. A hybrid PKS of the invention can result not only: 

(i) from fusions of heterologous domain (where heterologous means the domains 
in that module are from at least two different naturally occurring modules) coding 
sequences to produce a hybrid module coding sequence contained in a PKS gene whose 
product is incorporated into a PKS, 

but also: 

(ii) from fusions of heterologous module (where heterologous module means two 
modules are adjacent to one another that are not adjacent to one another in naturally 
occurring PKS enzymes) coding sequences to produce a hybrid coding sequence 
contained in a PKS gene whose product is incorporated into a PKS, 

(iii) from expression of one or more FK-520 PKS genes with one or more non- 
FK-520 PKS genes, including both naturally occurring and recombinant non-FK-520 
PKS genes, and 

(iv) from combinations of the foregoing. 

Various hybrid PKSs of the invention illustrating these various alternatives are described 
herein. 
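
The two kinds of fusion distinguished in alternatives (i) and (ii) above can be expressed as simple checks on where each domain or module originates. The Python sketch below is offered only as an illustration; the function names and the (PKS name, module number) representation are hypothetical, and the adjacency test assumes, as a simplification, that two modules are naturally adjacent only when they come from the same PKS and bear consecutive module numbers.

```python
from typing import List, Tuple

# (PKS name, module number in the naturally occurring PKS); hypothetical encoding.
Origin = Tuple[str, int]


def is_hybrid_module(domain_origins: List[Origin]) -> bool:
    """True if the module's domains derive from at least two different natural modules."""
    return len(set(domain_origins)) >= 2


def has_heterologous_junction(module_order: List[Origin]) -> bool:
    """True if any two neighboring modules are not neighbors in a natural PKS.

    Simplifying assumption: natural neighbors share a PKS and have
    consecutive module numbers.
    """
    for (pks_a, num_a), (pks_b, num_b) in zip(module_order, module_order[1:]):
        if pks_a != pks_b or num_b != num_a + 1:
            return True
    return False


# FK-520 module 8 carrying an AT domain taken from rapamycin module 12.
module_8_domains = [("FK-520", 8), ("rapamycin", 12), ("FK-520", 8)]
print(is_hybrid_module(module_8_domains))

# FK-520 module 7 followed directly by rapamycin module 12.
print(has_heterologous_junction([("FK-520", 7), ("rapamycin", 12)]))
```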

Examples of the production of a hybrid PKS by co-expression of PKS genes from
the FK-520 PKS and a non-FK-520 PKS include hybrid PKS enzymes produced by
co-expression of FK-520 and rapamycin PKS genes. Preferably, such hybrid PKS
enzymes are produced in recombinant Streptomyces host cells that produce FK-520 or
FK-506 but have been mutated to inactivate the gene whose function is to be replaced by
the rapamycin PKS gene introduced to produce the hybrid PKS. Particular examples
include (i) replacement of the fkbC gene with the rapB gene; and (ii) replacement of the
fkbA gene with the rapC gene. The latter hybrid PKS produces 13,15-didesmethoxy-FK-520,
if the host cell is an FK-520 producing host cell, and 13,15-didesmethoxy-FK-506,
if the host cell is an FK-506 producing host cell. The compounds produced by these
hybrid PKS enzymes are immunosuppressive and neurotrophic but can be readily
modified to act only as neurotrophins, as described in Example 6, below.

Other illustrative hybrid PKS enzymes of the invention are prepared by replacing
the fkbA gene of an FK-520 or FK-506 producing host cell with a hybrid fkbA gene in
which either: (a) the extender module 8 through 10, inclusive, coding sequences have been
replaced by the coding sequences for extender modules 12 to 14, inclusive, of the
rapamycin PKS; or (b) the module 8 coding sequences have been replaced by the
module 8 coding sequence of the rifamycin PKS. When expressed with the other,
naturally occurring FK-520 or FK-506 PKS genes and the genes of the modification
enzymes, the resulting hybrid PKS enzymes produce, respectively, (a) 13-desmethoxy-FK-520
or 13-desmethoxy-FK-506; and (b) 13-desmethoxy-13-methyl-FK-520 or
13-desmethoxy-13-methyl-FK-506. In a preferred embodiment, these recombinant PKS
genes of the invention are introduced into the producing host cell by a vector such as
pHU204, which is a plasmid pRM5 derivative that has the well-characterized SCP2*
replicon, the colE1 replicon, the tsr and bla resistance genes, and a cos site. This vector
can be used to introduce the recombinant fkbA replacement gene in an FK-520 or FK-506
producing host cell (or a host cell derived therefrom in which the endogenous fkbA gene
has been rendered inactive by mutation, deletion, or homologous recombination
with the gene that replaces it) to produce the desired hybrid PKS.

In constructing hybrid PKSs of the invention, certain general methods may be
helpful. For example, it is often beneficial to retain the framework of the module to be 
altered to make the hybrid PKS. Thus, if one desires to add DH and ER functionalities to 
a module, it is often preferred to replace the KR domain of the original module with a 
KR, DH, and ER domain-containing segment from another module, instead of merely 
inserting DH and ER domains. One can alter the stereochemical specificity of a module 
by replacement of the KS domain with a KS domain from a module that specifies a 
different stereochemistry. See Lau et al., 1999, "Dissecting the role of acyltransferase
domains of modular polyketide synthases in the choice and stereochemical fate of
extender units," Biochemistry 38(5): 1643-1651, incorporated herein by reference.
Stereochemistry can also be changed by changing the KR domain. Also, one can alter the
specificity of an AT domain by changing only a small segment of the domain. See Lau et
al., supra. One can also take advantage of known linker regions in PKS proteins to link
modules from two different PKSs to create a hybrid PKS. See Gokhale et al., 16 Apr.
1999, "Dissecting and Exploiting Intermodular Communication in Polyketide Synthases," 
Science 284: 482-485, incorporated herein by reference. 

The following Table lists references describing illustrative PKS genes and
corresponding enzymes that can be utilized in the construction of the recombinant PKSs
of the invention and the corresponding DNA compounds that encode them. Also
presented are various references describing tailoring enzymes and corresponding genes
that can be employed in accordance with the methods of the present invention.

Avermectin

U.S. Pat. No. 5,252,474 to Merck. 

MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular
Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256, A Comparison of the
Genes Encoding the Polyketide Synthases for Avermectin, Erythromycin, and
Nemadectin.

MacNeil et al., 1992, Gene 115: 119-125, Complex Organization of the
Streptomyces avermitilis genes encoding the avermectin polyketide synthase.

Ikeda et al., Aug. 1999, Organization of the biosynthetic gene cluster for the
polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis, Proc. Natl.
Acad. Sci. USA 96: 9509-9514.

Candicidin (FR008)

Hu et al., 1994, Mol. Microbiol. 14: 163-172.

Epothilone

U.S. Pat. App. Serial No. 60/130,560, filed 22 April 1999. 

Erythromycin

PCT Pub. No. 93/13663 to Abbott.

U.S. Pat. No. 5,824,513 to Abbott.

Donadio et al., 1991, Science 252: 675-9.

Cortes et al., 8 Nov. 1990, Nature 348: 176-8, An unusually large
multifunctional polypeptide in the erythromycin producing polyketide synthase of
Saccharopolyspora erythraea.

Glycosylation Enzymes 

PCT Pat. App. Pub. No. 97/23630 to Abbott. 

FK-506

Motamedi et al., 1998, The biosynthetic gene cluster for the macrolactone ring of
the immunosuppressant FK-506, Eur. J. Biochem. 256: 528-534.

Motamedi et al., 1997, Structural organization of a multifunctional polyketide
synthase involved in the biosynthesis of the macrolide immunosuppressant FK-506, Eur.
J. Biochem. 244: 74-80.

Methyltransferase

U.S. Pat. No. 5,264,355, issued 23 Nov. 1993, Methylating enzyme from
Streptomyces MA6858. 31-O-desmethyl-FK-506 methyltransferase.

Motamedi et al., 1996, Characterization of methyltransferase and
hydroxylase genes involved in the biosynthesis of the immunosuppressants FK-506 and
FK-520, J. Bacteriol. 178: 5243-5248.

Streptomyces hygroscopicus

U.S. patent application Serial No. 09/154,083, filed 16 Sep. 1998.

Lovastatin

U.S. Pat. No. 5,744,350 to Merck.

Narbomycin

U.S. patent application Serial No. 60/107,093, filed 5 Nov. 1998, and Serial No.
60/120,254, filed 16 Feb. 1999.

Nemadectin

MacNeil et al., 1993, supra.

Niddamycin

Kakavas et al., 1997, Identification and characterization of the niddamycin
polyketide synthase genes from Streptomyces caelestis, J. Bacteriol. 179: 7515-7522.

Oleandomycin

Swan et al., 1994, Characterisation of a Streptomyces antibioticus gene encoding
a type I polyketide synthase which has an unusual coding sequence, Mol. Gen. Genet.
242: 358-362.

U.S. patent application Serial No. 60/120,254, filed 16 Feb. 1999.

Olano et al., 1998, Analysis of a Streptomyces antibioticus chromosomal region
involved in oleandomycin biosynthesis, which encodes two glycosyltransferases
responsible for glycosylation of the macrolactone ring, Mol. Gen. Genet. 259(3): 299-308.

Picromycin

PCT patent application US99/15047, filed 2 Jul. 1999.

Xue et al., 1998, Hydroxylation of macrolactones YC-17 and narbomycin is
mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae, Chemistry
& Biology 5(11): 661-667.

Xue et al., Oct. 1998, A gene cluster for macrolide antibiotic biosynthesis in
Streptomyces venezuelae: Architecture of metabolic diversity, Proc. Natl. Acad. Sci.
USA 95: 12111-12116.

Platenolide

EP Pat. App. Pub. No. 791,656 to Lilly. 

Rapamycin

Schwecke et al., Aug. 1995, The biosynthetic gene cluster for the
polyketide rapamycin, Proc. Natl. Acad. Sci. USA 92: 7839-7843.

Aparicio et al., 1996, Organization of the biosynthetic gene cluster for rapamycin
in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular
polyketide synthase, Gene 169: 9-16.

Rifamycin

August et al., 13 Feb. 1998, Biosynthesis of the ansamycin antibiotic rifamycin:
deductions from the molecular analysis of the rif biosynthetic gene cluster of
Amycolatopsis mediterranei S699, Chemistry & Biology 5(2): 69-79.

Sorangium PKS

U.S. patent application Serial No. 09/144,085, filed 31 Aug. 1998.

Soraphen

U.S. Pat. No. 5,716,849 to Novartis. 

Schupp et al., 1995, J. Bacteriology 177: 3673-3679, A Sorangium cellulosum
(Myxobacterium) Gene Cluster for the Biosynthesis of the Macrolide Antibiotic
Soraphen A: Cloning, Characterization, and Homology to Polyketide Synthase Genes
from Actinomycetes.

Spiramycin

U.S. Pat. No. 5,098,837 to Lilly.

Activator Gene

U.S. Pat. No. 5,514,544 to Lilly.

Tylosin

EP Pub. No. 791,655 to Lilly.

U.S. Pat. No. 5,876,991 to Lilly.

Kuhstoss et al., 1996, Gene 183: 231-6, Production of a novel polyketide through
the construction of a hybrid polyketide synthase.

Tailoring enzymes

Merson-Davies and Cundliffe, 1994, Mol. Microbiol. 13: 349-355, Analysis of
five tylosin biosynthetic genes from the tylBA region of the Streptomyces fradiae
genome.

As the above Table illustrates, there are a wide variety of polyketide synthase 
genes that serve as readily available sources of DNA and sequence information for use in 
constructing the hybrid PKS-encoding DNA compounds of the invention. Methods for 
constructing hybrid PKS-encoding DNA compounds are described without reference to 
the FK-520 PKS in PCT patent publication No. 98/51695; U.S. Patent Nos. 5,672,491 
and 5,712,146 and U.S. patent application Serial Nos. 09/073,538, filed 6 May 1998, and 
09/141,908, filed 28 Aug 1998, each of which is incorporated herein by reference. 

The hybrid PKS-encoding DNA compounds of the invention can be and often are 
hybrids of more than two PKS genes. Moreover, there are often two or more modules in 
the hybrid PKS in which all or part of the module is derived from a second (or third) 
PKS. Thus, as one illustrative example, the present invention provides a hybrid FK-520 
PKS that contains the naturally occurring loading module and FkbP as well as modules 
one, two, four, six, seven, eight, nine, and ten of the FK-520 PKS and further
contains hybrid or heterologous modules three and five. Hybrid or heterologous module
three contains an AT domain that is specific for methylmalonyl CoA and can be derived,
for example, from the erythromycin or rapamycin PKS genes. Hybrid or heterologous
module five contains an AT domain that is specific for malonyl CoA and can be derived,
for example, from the picromycin or rapamycin PKS genes.

While an important embodiment of the present invention relates to hybrid PKS 
enzymes and corresponding genes, the present invention also provides recombinant FK- 
520 PKS genes in which there is no second PKS gene sequence present but which differ 
from the FK-520 PKS gene by one or more deletions. The deletions can encompass one 
or more modules and/or can be limited to a partial deletion within one or more modules. 
When a deletion encompasses an entire module, the resulting FK-520 derivative is at
least two carbons shorter than the compound from which it was derived. When a deletion is
within a module, the deletion typically encompasses a KR, DH, or ER domain, or both
DH and ER domains, or both KR and DH domains, or all three KR, DH, and ER
domains. 
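
The phrase "at least two carbons shorter" can be made concrete with a small calculation. The Python sketch below is illustrative only: the table of carbons contributed per extender unit reflects the usual chemistry of decarboxylative condensation (two backbone carbons plus any side-chain carbons) and is an assumption of this example rather than a statement taken from the specification, and the function name is hypothetical.

```python
# Carbons each extender unit contributes to the polyketide after
# decarboxylative condensation: two backbone carbons plus side-chain carbons.
# These values are an assumption of this sketch.
CARBONS_PER_UNIT = {
    "malonyl-CoA": 2,
    "methylmalonyl-CoA": 3,
    "ethylmalonyl-CoA": 4,
    "2-hydroxymalonyl-CoA": 2,
    "2-methoxymalonyl-CoA": 3,
}


def carbons_removed(deleted_module_units):
    """Total carbons lost when the modules using these extender units are deleted."""
    return sum(CARBONS_PER_UNIT[unit] for unit in deleted_module_units)


# Deleting one malonyl CoA specific module removes the minimum of two carbons.
print(carbons_removed(["malonyl-CoA"]))

# Deleting a methylmalonyl and an ethylmalonyl CoA specific module removes more.
print(carbons_removed(["methylmalonyl-CoA", "ethylmalonyl-CoA"]))
```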

To construct a hybrid PKS or FK-520 derivative PKS gene of the invention, one 
can employ a technique, described in PCT Pub. No. 98/27203 and U.S. patent application 
Serial No. 08/989,332, filed 11 Dec. 1997, each of which is incorporated herein by
reference, in which the large PKS gene is divided into two or more, typically three, 
segments, and each segment is placed on a separate expression vector. In this manner, 
each of the segments of the gene can be altered, and various altered segments can be 
combined in a single host cell to provide a recombinant PKS gene of the invention. This 
technique makes more efficient the construction of large libraries of recombinant PKS 
genes, vectors for expressing those genes, and host cells comprising those vectors. 
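
The efficiency gained by dividing the gene into segments can be seen by counting combinations: a small number of altered versions of each segment yields a much larger number of complete recombinant genes. The following Python sketch illustrates the arithmetic only; the variant names are hypothetical placeholders and do not correspond to constructs described herein.

```python
from itertools import product


def combine_segments(variants_per_segment):
    """Return every combination that takes one variant of each gene segment."""
    return list(product(*variants_per_segment))


# Hypothetical variant names for a PKS gene divided into three segments,
# each segment carried on a separate expression vector.
segment_1 = ["seg1-native", "seg1-altered-A"]
segment_2 = ["seg2-native", "seg2-altered-A", "seg2-altered-B"]
segment_3 = ["seg3-native", "seg3-altered-A"]

library = combine_segments([segment_1, segment_2, segment_3])
print(len(library))  # 2 * 3 * 2 combinations from only 7 vectors
```

Seven vectors in this hypothetical case give twelve distinct combinations, and the advantage grows multiplicatively as further altered segments are prepared.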

Thus, in one important embodiment, the recombinant DNA compounds of the 
invention are expression vectors. As used herein, the term expression vector refers to any 
nucleic acid that can be introduced into a host cell or cell-free transcription and 
translation medium. An expression vector can be maintained stably or transiently in a 
cell, whether as part of the chromosomal or other DNA in the cell or in any cellular 
compartment, such as a replicating vector in the cytoplasm. An expression vector also 
comprises a gene that serves to produce RNA that is translated into a polypeptide in the 
cell or cell extract. Furthermore, expression vectors typically contain additional 
functional elements, such as resistance-conferring genes to act as selectable markers. 

The various components of an expression vector can vary widely, depending on 
the intended use of the vector. In particular, the components depend on the host cell(s) in 
which the vector will be used or is intended to function. Vector components for 
expression and maintenance of vectors in E. coli are widely known and commercially 
available, as are vector components for other commonly used organisms, such as yeast 
cells and Streptomyces cells. 

In a preferred embodiment, the expression vectors of the invention are used to 
construct recombinant Streptomyces host cells that express a recombinant PKS of the 
invention. Preferred Streptomyces host cell/vector combinations of the invention include 
S. coelicolor CH999 and S. lividans K4-114 host cells, which do not produce
actinorhodin, and expression vectors derived from the pRM1 and pRM5 vectors, as
described in U.S. Patent No. 5,830,750 and U.S. patent application Serial Nos. 
08/828,898, filed 31 Mar. 1997, and 09/181,833, filed 28 Oct. 1998, each of which is 
incorporated herein by reference. 

The present invention provides a wide variety of expression vectors for use in
Streptomyces. For replicating vectors, the origin of replication can be, for example and
without limitation, that of a low copy number vector, such as SCP2* (see Hopwood et al.,
Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes
Foundation, Norwich, U.K., 1985); Lydiate et al., 1985, Gene 35: 223-235; and Kieser
and Melton, 1988, Gene 65: 83-91, each of which is incorporated herein by reference),
SLP1.2 (Thompson et al., 1982, Gene 20: 51-62, incorporated herein by reference), and
SG5(ts) (Muth et al., 1989, Mol. Gen. Genet. 219: 341-348, and Bierman et al., 1992,
Gene 116: 43-49, each of which is incorporated herein by reference), or that of a high copy
number vector, such as pIJ101 and pJV1 (see Katz et al., 1983, J. Gen. Microbiol. 129:
2703-2714; Vara et al., 1989, J. Bacteriol. 171: 5872-5881; and Servin-Gonzalez, 1993,
Plasmid 30: 131-140, each of which is incorporated herein by reference). Generally,
however, high copy number vectors are not preferred for expression of genes contained
on large segments of DNA. For non-replicating and integrating vectors, it is useful to
include at least an E. coli origin of replication, such as from pUC, p1P, p1I, and pBR. For
phage based vectors, the phages phiC31 and KC515 can be employed (see Hopwood et
al., supra).

Typically, the expression vector will comprise one or more marker genes by 
which host cells containing the vector can be identified and/or selected. Useful antibiotic 
resistance conferring genes for use in Streptomyces host cells include the ermE (confers 
resistance to erythromycin and other macrolides and lincomycin), tsr (confers resistance 
to thiostrepton), aadA (confers resistance to spectinomycin and streptomycin), aacC4 
(confers resistance to apramycin, kanamycin, gentamicin, geneticin (G418), and 
neomycin), hyg (confers resistance to hygromycin), and vph (confers resistance to 
viomycin) resistance conferring genes. 

The recombinant PKS gene on the vector will be under the control of a promoter,
typically with an attendant ribosome binding site sequence. The present invention
provides the endogenous promoters of the FK-520 PKS and related biosynthetic genes in
recombinant form, and these promoters are preferred for use in the native hosts and in
heterologous hosts in which the promoters function. A preferred promoter of the
invention is the fkbO gene promoter, comprised in a sequence of about 270 bp between
the start of the open reading frames of the fkbO and fkbB genes. The fkbO promoter is
believed to be bi-directional in that it promotes transcription of the genes fkbO, fkbP, and
fkbA in one direction and fkbB, fkbC, and fkbL in the other. Thus, in one aspect, the
present invention provides a recombinant expression vector comprising the promoter of
the fkbO gene of an FK-520 producing organism positioned to transcribe a gene other
than fkbO. In a preferred embodiment the transcribed gene is an FK-520 PKS gene. In
another preferred embodiment, the transcribed gene is a gene that encodes a protein 
comprised in a hybrid PKS. 

Heterologous promoters can also be employed and are preferred for use in host 
cells in which the endogenous FK-520 PKS gene promoters do not function or function 
poorly. A preferred heterologous promoter is the actI promoter and its attendant activator
gene actII-ORF4, which is provided in the pRM1 and pRM5 expression vectors, supra.
This promoter is activated in the stationary phase of growth when secondary metabolites
are normally synthesized. Other useful Streptomyces promoters include without limitation
those from the ermE gene and the melC1 gene, which act constitutively, and the tipA
gene and the merA gene, which can be induced at any growth stage. In addition, the T7 
RNA polymerase system has been transferred to Streptomyces and can be employed in 
the vectors and host cells of the invention. In this system, the coding sequence for the T7 
RNA polymerase is inserted into a neutral site of the chromosome or in a vector under the 
control of the inducible merA promoter, and the gene of interest is placed under the 
control of the T7 promoter. As noted above, one or more activator genes can also be 
employed to enhance the activity of a promoter. Activator genes in addition to the
actII-ORF4 gene discussed above include dnrI, redD, and ptpA genes (see U.S. patent
application Serial No. 09/181,833, supra) to activate promoters under their control.

In addition to providing recombinant DNA compounds that encode the FK-520
PKS, the present invention also provides DNA compounds that encode the enzymes
required for the biosynthesis of the ethylmalonyl CoA and 2-hydroxymalonyl CoA
utilized in the synthesis of FK-520. Thus, the present
invention also provides recombinant host cells that express the genes required for the 
biosynthesis of ethylmalonyl CoA and 2-hydroxymalonyl CoA. Figures 3 and 4 show the 
location of these genes on the cosmids of the invention and the biosynthetic pathway that 
produces ethylmalonyl CoA. 

For 2-hydroxymalonyl CoA biosynthesis, the fkbH, fkbI, fkbJ, and fkbK genes are
sufficient to confer this ability on Streptomyces host cells. For conversion of
2-hydroxymalonyl to 2-methoxymalonyl, the fkbG gene is also employed. While the
complete coding sequence for fkbH is provided on the cosmids of the invention, the
sequence for this gene provided herein may be missing a T residue, based on a
comparison made with a similar gene cloned from the ansamitocin gene cluster by Dr. H.
Floss. Where the sequence herein shows one T, there may be two, resulting in an
extension of the fkbH reading frame to encode the amino acid sequence:
MTIVKCLVWDLDNTLWRGTVLEDDEVVLTDEIREVITTLDDRGILQAVASKNDH 
DLAWERLERLGVAEYFVLARIGWGPKSQSVREIATELNFAPTTIAFIDDQPAERA 
EVAFHLPEVRCYPAEQAATLLSLPEFSPPVSTVDSRRRRLMYQAGFARDQAREA 
YSGPDEDFLRSLDLSMTIAPAGEEELSRVEELTLRTSQMNATGVHYSDADLRALL 
TDPAHEVLVVTMGDRFGPHGAVGIILLEKKPSTWHLKLLATSCRVVSFGAGATIL 
NWLTDQGARAGAHLVADFRRTDRNRMMEIAYRFAGFADSDCPCVSEVAGASA 
AGVERLHLEPSARPAPTTLTLTAADIAPVTVSAAG. 
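
The effect of a single missing T residue on a reading frame can be illustrated with a short translation routine. The Python sketch below uses the standard genetic code and a short made-up DNA sequence; the toy sequence, the insertion position, and the function names are hypothetical and are not taken from the fkbH sequence.

```python
# Standard genetic code, codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(dna, index, base):
    """Return the sequence with one base inserted before the given index."""
    return dna[:index] + base + dna[index:]


# Made-up sequence in which a single T is followed by an in-frame stop codon.
toy = "ATGGCATGACCGGATAA"
with_second_t = insert_base(toy, 7, "T")   # one T becomes two

print(translate(toy))            # short product, frame ends at TGA
print(translate(with_second_t))  # shifted frame reads through to a later stop
```

In the toy case the added T shifts the frame past the early stop codon, so the translated product is longer, which is the kind of reading frame extension contemplated above for fkbH.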

For ethylmalonyl CoA biosynthesis, one requires only a crotonyl CoA reductase,
which can be supplied by the host cell but can also be supplied by recombinant
expression of the fkbS gene of the present invention. To increase yield of ethylmalonyl
CoA, one can also express the fkbE and fkbU genes. While such production can
be achieved using only the recombinant genes above, one can also achieve such
production by placing into the recombinant host cell a large segment of the DNA
provided by the cosmids of the invention. Thus, for 2-hydroxymalonyl and
2-methoxymalonyl CoA biosynthesis, one can simply provide the cells with the segment of
DNA located on the left side of the FK-520 PKS genes shown in Figure 1. For
ethylmalonyl CoA biosynthesis, one can simply provide the cells with the segment of 
DNA located on the right side of the FK-520 PKS genes shown in Figure 1 or, 
alternatively, both the right and left segments of DNA. 

The recombinant DNA expression vectors that encode these genes can be used to 
construct recombinant host cells that can make these important polyketide building 
blocks from cells that otherwise are unable to produce them. For example, Streptomyces 
coelicolor and Streptomyces lividans do not synthesize ethylmalonyl CoA or
2-hydroxymalonyl CoA. The invention provides methods and vectors for constructing
recombinant Streptomyces coelicolor and Streptomyces lividans that are able to
synthesize either or both ethylmalonyl CoA and 2-hydroxymalonyl CoA. These host cells
are thus able to make polyketides that require these substrates and that cannot otherwise
be made in such cells.
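
The choice of which biosynthetic genes to introduce into a given host can be framed as a lookup: compare the extender units a PKS requires with those the host already makes, and supply the genes for whatever is missing. The Python sketch below is illustrative only. The gene assignments follow the description above (fkbS for ethylmalonyl CoA; fkbH, fkbI, fkbJ, and fkbK for 2-hydroxymalonyl CoA; fkbG additionally for 2-methoxymalonyl CoA), while the function name, the substrate labels, and the assumption that the host natively supplies malonyl CoA and methylmalonyl CoA are hypothetical.

```python
# Genes described in the text as conferring biosynthesis of each extender unit.
PATHWAY_GENES = {
    "ethylmalonyl-CoA": ["fkbS"],
    "2-hydroxymalonyl-CoA": ["fkbH", "fkbI", "fkbJ", "fkbK"],
    "2-methoxymalonyl-CoA": ["fkbH", "fkbI", "fkbJ", "fkbK", "fkbG"],
}


def genes_to_supply(required_units, host_units):
    """List the genes needed for extender units the host cannot make itself."""
    genes = []
    for unit in sorted(set(required_units) - set(host_units)):
        if unit not in PATHWAY_GENES:
            raise KeyError(f"no pathway genes recorded for {unit}")
        for gene in PATHWAY_GENES[unit]:
            if gene not in genes:      # avoid listing shared genes twice
                genes.append(gene)
    return genes


# Assumption of this sketch: the host makes only these two extender units.
host = {"malonyl-CoA", "methylmalonyl-CoA"}
pks_needs = {"malonyl-CoA", "methylmalonyl-CoA", "ethylmalonyl-CoA",
             "2-methoxymalonyl-CoA"}
print(genes_to_supply(pks_needs, host))
```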

In a preferred embodiment, the present invention provides recombinant 
Streptomyces host cells, such as S. coelicolor and S. lividans, that have been transformed 
with a recombinant vector of the invention that codes for the expression of the 
ethylmalonyl CoA biosynthetic genes. The resulting host cells produce ethylmalonyl 
CoA and so are preferred host cells for the production of polyketides produced by PKS 
enzymes that comprise one or more AT domains specific for ethylmalonyl CoA. 
Illustrative PKS enzymes of this type include the FK-520 PKS and a recombinant PKS in 
which one or more AT domains is specific for ethylmalonyl CoA. 

In a related embodiment, the present invention provides Streptomyces host cells in 
which one or more of the ethylmalonyl or 2-hydroxymalonyl biosynthetic genes have 
been deleted by homologous recombination or rendered inactive by mutation. For 
example, deletion or inactivation of the fkbG gene can prevent formation of the methoxyl 
groups at C-13 and C-15 of FK-520 (or, in the corresponding FK-506 producing cell, 
FK-506), leading to the production of 13,15-didesmethoxy-13,15-dihydroxy-FK-520 (or, in 
the corresponding FK-506 producing cell, 13,15-didesmethoxy-13,15-dihydroxy-FK-506). 
If the fkbG gene product acts on 2-hydroxymalonyl and the resulting 
2-methoxymalonyl substrate is required for incorporation by the PKS, the AT domains of 
modules 7 and 8 may bind malonyl CoA and methylmalonyl CoA. Such incorporation 
results in the production of a mixture of polyketides in which the methoxy groups at C-13 
and C-15 of FK-520 (or FK-506) are replaced by either hydrogen or methyl. 

This possibility of non-specific binding is suggested by the construction of a hybrid 
PKS of the invention in which the AT domain of module 8 of the FK-520 PKS replaced 
the AT domain of module 6 of DEBS. The resulting PKS produced, in Streptomyces 
lividans, 6-dEB and 2-desmethyl-6-dEB, indicating that the AT domain of module 8 of 
the FK-520 PKS could bind malonyl CoA and methylmalonyl CoA substrates. Thus, one 
could possibly also prepare the 13,15-didesmethoxy-FK-520 and corresponding FK-506 
compounds of the invention by deleting or otherwise inactivating one or more or all of 
the genes required for 2-hydroxymalonyl CoA biosynthesis, i.e., the fkbH, fkbI, fkbJ, and 
fkbK genes. In any event, the deletion or inactivation of one or more biosynthetic genes 
required for ethylmalonyl and/or 2-hydroxymalonyl production prevents the formation of 
polyketides requiring ethylmalonyl and/or 2-hydroxymalonyl for biosynthesis, and the 
resulting host cells are thus preferred for production of polyketides that do not require the 
same. 

The host cells of the invention can be grown and fermented under conditions 
known in the art for other purposes to produce the compounds of the invention. See, e.g., 
U.S. Patent Nos. 5,194,378; 5,116,756; and 5,494,820, incorporated herein by reference, 
for suitable fermentation processes. The compounds of the invention can be isolated from 
the fermentation broths of these cultured cells and purified by standard procedures. 
Preferred compounds of the invention include the following compounds: 13-desmethoxy- 
FK-506; 13-desmethoxy-FK-520; 13,15-didesmethoxy-FK-506; 13,15-didesmethoxy- 
FK-520; 13-desmethoxy-18-hydroxy-FK-506; 13-desmethoxy-18-hydroxy-FK-520; 
13,15-didesmethoxy-18-hydroxy-FK-506; and 13,15-didesmethoxy-18-hydroxy-FK-520. 
These compounds can be further modified as described for tacrolimus and FK-520 in 
U.S. Patent Nos. 5,225,403; 5,189,042; 5,164,495; 5,068,323; 4,980,466; and 4,920,218, 
incorporated herein by reference. 

Other compounds of the invention are shown in Figure 8, Parts A and B. In Figure 
8, Part A, illustrative C-32-substituted compounds of the invention are shown in two 
columns under the heading R. The substituted compounds are preferred for topical 
administration and are applied to the dermis for treatment of conditions such as psoriasis. 
In Figure 8, Part B, illustrative reaction schemes for making the compounds shown in 
Figure 8, Part A, are provided. In the upper scheme in Figure 8, Part B, the C-32 
substitution is a tetrazole moiety, illustrative of the groups shown in the left column 
under R in Figure 8, Part A. In the lower scheme in Figure 8, Part B, the C-32 
substitution is a disubstituted amino group, where R3 and R4 can be any group similar to 
the illustrative groups shown attached to the amine in the right column under R in Figure 
8, Part A. While Figure 8 shows the C-32-substituted compounds in which the C-15 
methoxy is present, the invention includes these C-32-substituted compounds in which 
C-15 is ethyl, methyl, or hydrogen. Also, while C-21 is shown as substituted with ethyl or 
allyl, the compounds of the invention include the C-32-substituted compounds in which 
C-21 is substituted with hydrogen or methyl. 

To make these C-32-substituted compounds, Figure 8, Part B, provides illustrative 
reaction schemes. Thus, a selective reaction of the starting compound (see Figure 8, Part 
B, for an illustrative starting compound) with trifluoromethanesulfonic anhydride in the 
presence of a base yields the C-32 O-triflate derivative, as shown in the upper scheme of 
Figure 8, Part B. Displacement of the triflate with 1H-tetrazole or triazole derivatives 
provides the C-32 tetrazole or triazole derivative. As shown in the lower scheme of 
Figure 8, Part B, reacting the starting compound with p-nitrophenylchloroformate yields 
the corresponding carbonate, which, upon displacement with an amino compound, 
provides the corresponding carbamate derivative. 

The compounds can be readily formulated to provide the pharmaceutical 
compositions of the invention. The pharmaceutical compositions of the invention can be 
used in the form of a pharmaceutical preparation, for example, in solid, semisolid, or 
liquid form. This preparation contains one or more of the compounds of the invention as 
an active ingredient in admixture with an organic or inorganic carrier or excipient 
suitable for external, enteral, or parenteral application. The active ingredient may be 
compounded, for example, with the usual non-toxic, pharmaceutically acceptable carriers 
for tablets, pellets, capsules, suppositories, solutions, emulsions, suspensions, and any 
other form suitable for use. Suitable formulation processes and compositions for the 
compounds of the present invention are described with respect to tacrolimus in U.S. 
Patent Nos. 5,939,427; 5,922,729; 5,385,907; 5,338,684; and 5,260,301, incorporated 
herein by reference. Many of the compounds of the invention contain one or more chiral 
centers, and all of the stereoisomers are included within the scope of the invention, as 
pure compounds as well as mixtures of stereoisomers. Thus the compounds of the 
invention may be supplied as a mixture of stereoisomers in any proportion. 

The carriers which can be used include water, glucose, lactose, gum acacia, 
gelatin, mannitol, starch paste, magnesium trisilicate, talc, corn starch, keratin, colloidal 
silica, potato starch, urea, and other carriers suitable for use in manufacturing 
preparations, in solid, semi-solid, or liquified form. In addition, auxiliary stabilizing, 
thickening, and coloring agents and perfumes may be used. For example, the compounds 
of the invention may be utilized with hydroxypropyl methylcellulose essentially as 
described in U.S. Patent No. 4,916,138, incorporated herein by reference, or with a 
surfactant essentially as described in EPO patent publication No. 428,169, incorporated 
herein by reference. 

Oral dosage forms may be prepared essentially as described by Hondo et al., 
1987, Transplantation Proceedings XIX, Supp. 6: 17-22, incorporated herein by 
reference. Dosage forms for external application may be prepared essentially as described 
in EPO patent publication No. 423,714, incorporated herein by reference. The active 
compound is included in the pharmaceutical composition in an amount sufficient to 
produce the desired effect upon the disease process or condition. 

For the treatment of conditions and diseases relating to immunosuppression or 
neuronal damage, a compound of the invention may be administered orally, topically, 
parenterally, by inhalation spray, or rectally in dosage unit formulations containing 
conventional non-toxic pharmaceutically acceptable carriers, adjuvants, and vehicles. The 
term parenteral, as used herein, includes subcutaneous injections, and intravenous, 
intramuscular, and intrasternal injection or infusion techniques. 

Dosage levels of the compounds of the present invention are of the order from 
about 0.01 mg to about 50 mg per kilogram of body weight per day, preferably from 
about 0.1 mg to about 10 mg per kilogram of body weight per day. The dosage levels are 
useful in the treatment of the above-indicated conditions (from about 0.7 mg to about 3.5 
g per patient per day, assuming a 70 kg patient). In addition, the compounds of the 
present invention may be administered on an intermittent basis, i.e., at semi-weekly, 
weekly, semi-monthly, or monthly intervals. 
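
The per-patient figures follow from multiplying the per-kilogram dose by body weight. The short sketch below illustrates that arithmetic only; the function name `daily_dose_range_mg` and its default weight argument are hypothetical names introduced here for illustration and are not part of the specification.

```python
def daily_dose_range_mg(low_mg_per_kg, high_mg_per_kg, body_weight_kg=70.0):
    """Convert a per-kilogram daily dose range into a per-patient range (mg/day)."""
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Broad range stated in the text: 0.01 to 50 mg per kg per day.
broad = daily_dose_range_mg(0.01, 50.0)
# Preferred range stated in the text: 0.1 to 10 mg per kg per day.
preferred = daily_dose_range_mg(0.1, 10.0)

print("broad range:     %.1f mg to %.1f mg per patient per day" % broad)
print("preferred range: %.1f mg to %.1f mg per patient per day" % preferred)
```

For a 70 kg patient the broad range spans 0.7 mg to 3500 mg (3.5 g) per day, and the preferred range spans 7 mg to 700 mg per day.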

The amount of active ingredient that may be combined with the carrier materials 
to produce a single dosage form will vary depending upon the host treated and the 
particular mode of administration. For example, a formulation intended for oral 
administration to humans may contain from 0.5 mg to 5 g of active agent compounded 
with an appropriate and convenient amount of carrier material, which may vary from 
about 5 percent to about 95 percent of the total composition. Dosage unit forms will 
generally contain from about 0.5 mg to about 500 mg of active ingredient. For external 
administration, the compounds of the invention can be formulated within the range of, for 
example, 0.00001% to 60% by weight, preferably from 0.001% to 10% by weight, and 
most preferably from about 0.005% to 0.8% by weight. The compounds and 
compositions of the invention are useful in treating disease conditions using doses and 
administration schedules as described for tacrolimus in U.S. Patent Nos. 5,542,436; 
5,365,948; 5,348,966; and 5,196,437, incorporated herein by reference. The compounds 
of the invention can be used as single therapeutic agents or in combination with other 
therapeutic agents. Drugs that can be usefully combined with compounds of the invention 
include one or more immunosuppressant agents such as rapamycin, cyclosporin A, FK- 
506, or one or more neurotrophic agents. 

It will be understood, however, that the specific dosage level for any particular 
patient will depend on a variety of factors. These factors include the activity of the 
specific compound employed; the age, body weight, general health, sex, and diet of the 
subject; the time and route of administration and the rate of excretion of the drug; 
whether a drug combination is employed in the treatment; and the severity of the 
particular disease or condition for which therapy is sought. 

A detailed description of the invention having been provided above, the following 
examples are given for the purpose of illustrating the present invention and shall not be 
construed as being a limitation on the scope of the invention or claims. 

Example 1 

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-520 
The C-13 methoxyl group is introduced into FK-520 via an AT domain in 
extender module 8 of the PKS that is specific for hydroxymalonyl and by methylation of 
the hydroxyl group by an S-adenosyl methionine (SAM) dependent methyltransferase. 
Metabolism of FK-506 and FK-520 primarily involves oxidation at the C-13 position into 
an inactive derivative that is further degraded by host P450 and other enzymes. The 
present invention provides compounds related in structure to FK-506 and FK-520 that do 
not contain the C-13 methoxy group and exhibit greater stability and a longer half-life in 
vivo. These compounds are useful medicaments due to their immunosuppressive and 
neurotrophic activities, and the invention provides the compounds in purified form and as 
pharmaceutical compositions. 

The present invention also provides the novel PKS enzymes that produce these 
novel compounds as well as the expression vectors and host cells that produce the novel 
PKS enzymes. The novel PKS enzymes include, among others, those that contain an AT 
domain specific for either malonyl CoA or methylmalonyl CoA in module 8 of the 
FK-506 and FK-520 PKS. This example describes the construction of recombinant DNA 
compounds that encode the novel FK-520 PKS enzymes and the transformation of host 
cells with those recombinant DNA compounds to produce the novel PKS enzymes and 
the polyketides produced thereby. 

To construct an expression cassette for performing module 8 AT domain 
replacements in the FK-520 PKS, a 4.6 kb SphI fragment from the FK-520 gene cluster 
was cloned into plasmid pLitmus 38 (a cloning vector available from New England 
Biolabs). The 4.6 kb SphI fragment, which encodes the ACP domain of module 7 
followed by module 8 through the KR domain, was isolated from an agarose gel after 
digesting the cosmid pKOS65-C31 with SphI. The clone having the insert oriented so 
the single SacI site was nearest to the SpeI end of the polylinker was identified and 
designated as plasmid pKOS60-21-67. To generate appropriate cloning sites, two linkers 
were ligated sequentially as follows. First, a linker was ligated between the SpeI and 
SacI sites to introduce a BglII site at the 5' end of the cassette, to eliminate interfering 
polylinker sites, and to reduce the total insert size to 4.5 kb (the limit of the phage 
KC515). The ligation reactions contained 5 picomolar unphosphorylated linker DNA and 
0.1 picomolar vector DNA, i.e., a 50-fold molar excess of linker to vector. The linker had 
the following sequence: 

5'-CTAGTGGGCAGATCTGGCAGCT-3' 
    3'-ACCCGTCTAGACCG-5' 

The resulting plasmid was designated pKOS60-27-1. 

Next, a linker of the following sequence was ligated between the unique SphI and 
AflII sites of plasmid pKOS60-27-1 to introduce an NsiI site at the 3' end of the module 8 
cassette. The linker employed was: 

    5'-GGGATGCATGGC-3' 
3'-GTACCCCTACGTACCGAATT-5' 

The resulting plasmid was designated pKOS60-29-55. 
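
The design of the two linkers can be checked by confirming that the shorter strand pairs fully with the longer strand and that each linker carries the intended new restriction site. The sketch below is illustrative only; the helper names `complement` and `duplex_offset` are hypothetical and were introduced here, and the strands are entered as printed above (top strand 5' to 3', bottom strand 3' to 5').

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement, keeping left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq)


def duplex_offset(top_5to3, bottom_3to5):
    """Locate where the shorter strand pairs along the longer strand.

    Returns (name_of_longer_strand, offset) when every base of the shorter
    strand is paired, or None when no such placement exists.
    """
    if len(top_5to3) >= len(bottom_3to5):
        longer, probe, name = top_5to3, complement(bottom_3to5), "top"
    else:
        longer, probe, name = bottom_3to5, complement(top_5to3), "bottom"
    offset = longer.find(probe)
    return (name, offset) if offset >= 0 else None


# First linker (SpeI/SacI ends, introduces BglII = AGATCT).
linker1_top = "CTAGTGGGCAGATCTGGCAGCT"
linker1_bottom = "ACCCGTCTAGACCG"
# Second linker (SphI/AflII ends, introduces NsiI = ATGCAT).
linker2_top = "GGGATGCATGGC"
linker2_bottom = "GTACCCCTACGTACCGAATT"

pairing1 = duplex_offset(linker1_top, linker1_bottom)
pairing2 = duplex_offset(linker2_top, linker2_bottom)

print(pairing1, "BglII present:", "AGATCT" in linker1_top)
print(pairing2, "NsiI present:", "ATGCAT" in linker2_top)
```

In both linkers the paired region is flanked by four unpaired bases on each side of the longer strand, which are the single-stranded ends that anneal to the digested vector.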

To allow in-frame insertions of alternative AT domains, sites were engineered at 
the 5' end (Avr II or Nhe I) and 3' end (Xho I) of the AT domain using the polymerase 
chain reaction (PCR) as follows. Plasmid pKOS60-29-55 was used as a template for the 
PCR and sequence 5' to the AT domain was amplified with the primers SpeBgl-fwd and 
either Avr-rev or Nhe-rev: 

SpeBgl-fwd  5'-CGACTCACTAGTGGGCAGATCTGG-3' 
Avr-rev     5'-CACGCCTAGGCCGGTCGGTCTCGGGCCAC-3' 
Nhe-rev     5'-GCGGCTAGCTGCTCGCCCATCGCGGGATGC-3' 

The PCR included, in a 50 ul reaction, 5 ul of 10x Pfu polymerase buffer 
(Stratagene), 5 ul 10x z-dNTP mixture (2 mM dATP, 2 mM dCTP, 2 mM dTTP, 1 mM 
dGTP, 1 mM 7-deaza-GTP), 5 ul DMSO, 2 ul of each primer (10 uM), 1 ul of template 
DNA (0.1 ug/ul), and 1 ul of cloned Pfu polymerase (Stratagene). The PCR conditions 
were 95°C for 2 min., 25 cycles at 95°C for 30 sec, 60°C for 30 sec, and 72°C for 4 
min., followed by 4 min. at 72°C and a hold at 0°C. The amplified DNA products and 
the Litmus vectors were cut with the appropriate restriction enzymes (BglII and AvrII or 
SpeI and NheI), and cloned into either pLitmus 28 or pLitmus 38 (New England Biolabs), 
respectively, to generate the constructs designated pKOS60-37-4 and pKOS60-37-2, 
respectively. 
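
The final concentrations implied by the reaction mixture above follow from simple dilution arithmetic. The sketch below is illustrative only. It assumes that the balance of the 50 ul reaction is water, which the text does not state, and the names `components_ul` and `final_concentration` are hypothetical.

```python
REACTION_VOLUME_UL = 50.0

# Component volumes (ul) as listed in the text; two primers at 2 ul each.
components_ul = {
    "10x Pfu polymerase buffer": 5.0,
    "10x z-dNTP mixture": 5.0,
    "DMSO": 5.0,
    "forward primer (10 uM)": 2.0,
    "reverse primer (10 uM)": 2.0,
    "template DNA (0.1 ug/ul)": 1.0,
    "cloned Pfu polymerase": 1.0,
}


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilute a stock concentration into the total reaction volume."""
    return stock * volume_ul / total_ul


# Assumption: the remaining volume is made up with water.
water_ul = REACTION_VOLUME_UL - sum(components_ul.values())
primer_um = final_concentration(10.0, 2.0)   # each primer, uM
datp_mm = final_concentration(2.0, 5.0)      # dATP, dCTP, and dTTP, mM
dgtp_mm = final_concentration(1.0, 5.0)      # dGTP and 7-deaza-GTP, mM
template_ng = 0.1 * 1.0 * 1000.0             # ug/ul x ul, converted to ng
dmso_percent = 100.0 * components_ul["DMSO"] / REACTION_VOLUME_UL

print("water: %.0f ul" % water_ul)
print("each primer: %.1f uM" % primer_um)
print("dATP/dCTP/dTTP: %.1f mM; dGTP and 7-deaza-GTP: %.1f mM each" % (datp_mm, dgtp_mm))
print("template: %.0f ng; DMSO: %.0f%% (v/v)" % (template_ng, dmso_percent))
```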

Plasmid pKOS60-29-55 was again used as a template for PCR to amplify 
sequence 3' to the AT domain using the primers BsrXho-fwd and NsiAfl-rev: 

BsrXho-fwd  5'-GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC-3' 
NsiAfl-rev  5'-CGACTCACTTAAGCCATGCATCC-3' 

PCR conditions were as described above. The PCR fragment was cut with BsrGI 
and AflII, gel isolated, and ligated into pKOS60-37-4 cut with Asp718 and AflII and 
inserted into pKOS60-37-2 cut with BsrGI and AflII, to give the plasmids pKOS60-39-1 
and pKOS60-39-13, respectively. These two plasmids can be digested with AvrII and 
XhoI or NheI and XhoI, respectively, to insert heterologous AT domains specific for 
malonyl, methylmalonyl, ethylmalonyl, or other extender units. 
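
The restriction sites engineered into the five flanking-region primers of this example can be confirmed by a simple substring search, because all of the recognition sequences involved are palindromic and so read the same on either strand. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the recognition sequences are the standard ones for the named enzymes.

```python
# Standard recognition sequences for the enzymes named in this example.
SITES = {
    "SpeI": "ACTAGT",
    "BglII": "AGATCT",
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
    "BsrGI": "TGTACA",
    "XhoI": "CTCGAG",
    "AflII": "CTTAAG",
    "NsiI": "ATGCAT",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "SpeBgl-fwd": "CGACTCACTAGTGGGCAGATCTGG",
    "Avr-rev": "CACGCCTAGGCCGGTCGGTCTCGGGCCAC",
    "Nhe-rev": "GCGGCTAGCTGCTCGCCCATCGCGGGATGC",
    "BsrXho-fwd": "GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC",
    "NsiAfl-rev": "CGACTCACTTAAGCCATGCATCC",
}


def sites_in(sequence):
    """Return the sorted names of all listed enzymes whose site occurs in sequence."""
    return sorted(name for name, site in SITES.items() if site in sequence)


found = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, enzymes in found.items():
    print("%-11s %s" % (name, ", ".join(enzymes)))
```

Each primer carries exactly the sites its name indicates, which is consistent with the cloning steps described in the text.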

Malonyl and methylmalonyl-specific AT domains were cloned from the 
rapamycin cluster using PCR amplification with a pair of primers that introduce an AvrII 
or NheI site at the 5' end and an XhoI site at the 3' end. The PCR conditions were as 
given above and the primer sequences were as follows: 

RATN1 5'-ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG-3' 
(3' end of Rap KS sequence and universal for malonyl and methylmalonyl CoA), 
RATMN2 5'-ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG-3' 
(Rap AT shorter version 5'- sequence and specific for malonyl CoA), 
RATMMN2 5'-ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA-3' 
(Rap AT shorter version 5'- sequence and specific for methylmalonyl CoA), and 
RATC 5'-ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG-3' 
(Rap DH 5'- sequence and universal for malonyl and methylmalonyl CoA). 

[Figure: primer positions on any Rap module. N1 (AvrII), MN2 (NheI), and MMN2 (NheI) lie at the 5' side of the AT domain; C (XhoI) lies at the 3' side of the AT domain, at the start of the DH domain.] 
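
Two of the primers listed above, RATN1 and RATC, contain degenerate positions written in IUPAC nucleotide codes (R = A or G, Y = C or T, S = C or G), so each represents a small pool of oligonucleotides. The sketch below enumerates the pool for each primer. It is illustrative only: the function name `expand` is hypothetical, and the lookup table covers only the codes that occur in these four primers.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes: only those used by the primers above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "S": "CG",  # strong pair
}

PRIMERS = {
    "RATN1": "ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG",
    "RATMN2": "ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG",
    "RATMMN2": "ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA",
    "RATC": "ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG",
}


def expand(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer)
    return ["".join(combo) for combo in product(*choices)]


pools = {name: expand(seq) for name, seq in PRIMERS.items()}
for name, pool in pools.items():
    print("%-8s %d sequence(s)" % (name, len(pool)))
```

Each degenerate primer has two ambiguous positions with two possibilities apiece, so RATN1 and RATC each expand to four sequences, while the AT-specific primers RATMN2 and RATMMN2 are single sequences.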



Because of the high sequence similarity in each module of the rapamycin cluster, 
each primer was expected to prime any of the AT domains. PCR products representing 
ATs specific for malonyl or methylmalonyl extenders were identified by sequencing 
individual cloned PCR products. Sequencing also confirmed that the chosen clones 
contained no cloning artifacts. Examples of hybrid modules with the rapamycin AT12 
and AT13 domains are shown in a separate figure. 

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS 
with the endogenous AT domain replaced by the AT domain of module 12 of the 
rapamycin PKS has the DNA sequence and encodes the amino acid sequence shown 
below. The AT of rap module 12 is specific for incorporation of malonyl units. 

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
IWQLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
AAVLGHVGGEDIPATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
FKDLGIDSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 
ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 
TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLTPADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSNIGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400 
GIIKMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 
LHADEPSPHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
ELLTSARPWPETDRPR 
GGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACCAACGCCCACGTCATC 1550 
RAGVSSFGISGTNAHVI 
CTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAACGCGGTGATCGAGCG 1600 
LESAPPTQPADNAVIER 
GGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCAGGACCCAGTCGGCTT 1650 
APEWVPLVISARTQSA 
TGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTGGCGGCGTCGCCCGGG 1700 
LTEHEGRLRAYLAASPG 
GTGGATATGCGGGCTGTGGCATCGACGCTGGCGATGACACGGTCGGTGTT 1750 
VDMRAVASTLAMTRSVF 
CGAGCACCGTGCCGTGCTGCTGGGAGATGACACCGTCACCGGCACCGCTG 1800 
EHRAVLLGDDTVTGTA 
TGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGACAGGGGTCGCAGCGT 1850 
VSDPRAVFVFPGQGSQR 
GCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCCCGTCTTCGCGCGGAT 1900 
AGMGEELAAAFPVFARI 
CCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACG 1950 
HQQVWDLLDVPDLEVN 
AGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTC 2000 
ETGYAQPALFAMQVALF 
GGGCTGCTGGAATCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTC 2050 
GLLESWGVRPDAVIGHS 
GGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGG 2100 
VGELAAAYVSGVWSLE 
ATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCC 2150 
DACTLVSARARLMQALP 
GCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGC 2200 
AGGVMVAVPVSEDEARA 
CGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGG 2250 
VLGEGVEIAAVNGPSS 
TGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTG 2300 
VVLSGDEAAVLQAAEGL 
GGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTAT 2350 
GKWTRLATSHAFHSARM 
GGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACC 2400 
EPMLEEFRAVAEGLTY 
GGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAG 2450 
RTPQVSMAVGDQVTTAE 
TACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGC 2500 
YWVRQVRDTVRFGEQVA 
CTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGG 2550 
SYEDAVFVELGADRSL 
CCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAG 2600 
ARLVDGVAMLHGDHEIQ 
GCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGA 2650 
AAIGALAHLYVNGVTVD 
CTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTC 2700 
WPALLGDAPATRVLDL 
CGACATACGCCTTCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCG 2750 
PTYAFQHQRYWLESARP 
GCCGCATCCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGC 2800 
AASDAGHPVLGSGIALA 
CGGGTCGCCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACC 2850 
GSPGRVFTGSVPTGAD 
GCGCGGTGTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGAC 2900 
RAVFVAELALAAADAVD 
TGCGCCACGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGG 2950 
CATVERLDIASVPGRPG 
CCATGGCCGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACG 3000 
HGRTTVQTWVDEPADD 
GCCGGCGCCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACG 3050 
GRRRFTVHTRTGDAPWT 
CTGCACGCCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGC 3100 
LHAEGVLRPHGTALPDA 
GGCCGACGCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGC 3150 
ADAEWPPPGAVPADGL 
CGGGTGTGTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGAC 3200 
PGVWRRGDQVFAEAEVD 
GGACCGGACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTC 3250 
GPDGFVVHPDLLDAVFS 
CGCGGTCGGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGG 3300 
AVGDGSRQPAGWRDLT 
TGCACGCGTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACC 3350 
VHASDATVLRACLTRRT 
GACGGAGCCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACT 3400 
DGAMGFAAFDGAGLPVL 
CACCGCGGAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCG 3450 
TAEAVTLREVASPSGS 
AGGAGTCGGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCG 3500 
EESDGLHRLEWLAVAEA 
GTCTACGACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCA 3550 
VYDGDLPEGHVLITAAH 
CCCCGACGACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCC 3600 
PDDPEDIPTRAHTRAT 
GCGTCCTGACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTC 3650 
RVLTALQHHLTTTDHTL 
ATCGTCCACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCAC 3700 
IVHTTTDPAGATVTGLT 
CCGCACCGCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCG 3750 
RTAQNEHPHRIRLIET 
ACCACCCCCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCAC 3800 
DHPHTPLPLAQLATLDH 
CCCCACCTCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCC 3850 
PHLRLTHHTLHHPHLTP 
CCTCCACACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACG 3900 
LHTTTPPTTTPLNPEH 
CCATCATCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGC 3950 
AIIITGGSGTLAGILAR 
CACCTGAACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGA 4000 
HLNHPHTYLLSRTPPPD 
CGCCACCCCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAAC 4050 
ATPGTHLPCDVGDPHQ 
TCGCCACCACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCAC 4100 
LATTLTHIPQPLTAIFH 
ACCGCCGCCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCG 4150 
TAATLDDGILHALTPDR 
CCTCACCACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACC 4200 
LTTVLHPKANAAWHLH 
ACCTCACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCC 4250 
HLTQNQPLTHFVLYSSA 
GCCGCCGTCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGC 4300 
AAVLGSPGQGNYAAANA 
CTTCCTCGACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCA 4350 
FLDALATHRHTLGQPA 
CCTCCATCGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAA 4400 
TSIAWGMWHTTSTLTGQ 
CTCGACGACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGAT 4450 
LDDADRDRIRRGGFLPI 
CACGGACGACGAGGGCATGGGGATGCAT 
TDDEG 
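
The reading frame of the listing above can be checked by translating its first 50 nucleotides with the standard genetic code. The peptide printed beneath that line corresponds to the third reading frame, i.e., translation beginning at nucleotide 3. The sketch below is illustrative only; the names `CODON_TABLE` and `translate` are hypothetical and were introduced here.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame_offset=0):
    """Translate complete codons of dna, starting frame_offset bases in."""
    codons = (dna[i:i + 3] for i in range(frame_offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Nucleotides 1-50 of the listing above.
first_line = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
peptide = translate(first_line, frame_offset=2)
print(peptide)
```

The same function can be applied line by line, carrying the frame across line boundaries, to compare any other stretch of the nucleotide sequence with its printed translation.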

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS 
with the endogenous AT domain replaced by the AT domain of module 13 (specific for 
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the 
amino acid sequence shown below. 

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
AAVLGHVGGEDIPATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
FKDLGIDSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 
ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 
TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLTPADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSNIGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400 
GIIKMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 
LHADEPSPHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
ELLTSARPWPETDRPR 
GGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACCAACGCCCACGTCATC 1550 
RAGVSSFGVSGTNAHVI 
CTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGAGGCGCAGCCTGTTGA 1600 
LESAPPAQPAEEAQPVE 
GACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGGTGATATCGGCCAAGA 1650 
TPVVASDVLPLVISAK 
CCCAGCCCGCCCTGACCGAACACGAAGACCGGCTGCGCGCCTACCTGGCG 1700 
TQPALTEHEDRLRAYLA 
GCGTCGCCCGGGGCGGATATACGGGCTGTGGCATCGACGCTGGCGGTGAC 1750 
ASPGADIRAVASTLAVT 
ACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTGGAGATGACACCGTCA 1800 
RSVFEHRAVLLGDDTV 
CCGGCACCGCGGTGACCGACCCCAGGATCGTGTTTGTCTTTCCCGGGCAG 1850 
TGTAVTDPRIVFVFPGQ 
GGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCGCGATTCGTCGGTGGT 1900 
GWQWLGMGSALRDSSVV 
GTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGTTGCGCGAGTTCGTGG 1950 
FAERMAECAAALREFV 
ACTGGGATCTGTTCACGGTTCTGGATGATCCGGCGGTGGTGGACCGGGTT 2000 
DWDLFTVLDDPAVVDRV 
GATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGTTTCCCTGGCCGCGGT 2050 
DVVQPASWAMMVSLAAV 
GTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGATCGGCCATTCGCAGG 2100 
WQAAGVRPDAVIGHSQ 
GTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTGTCACTACGCGATGCC 2150 
GEIAAACVAGAVSLRDA 
GCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGCCCGGGGCCTGGCGGG 2200 
ARIVTLRSQAIARGLAG 
CCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGCAGGATGTCGAGCTGG 2250 
RGAMASVALPAQDVEL 
TCGACGGGGCCTGGATCGCCGCCCACAACGGGCCCGCCTCCACCGTGATC 2300 
VDGAWIAAHNGPASTVI 
GCGGGCACCCCGGAAGCGGTCGACCATGTCCTCACCGCTCATGAGGCACA 2350 
AGTPEAVDHVLTAHEAQ 
AGGGGTGCGGGTGCGGCGGATCACCGTCGACTATGCCTCGCACACCCCGC 2400 
GVRVRRITVDYASHTP 
ACGTCGAGCTGATCCGCGACGAACTACTCGACATCACTAGCGACAGCAGC 2450 
HVELIRDELLDITSDSS 
TCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGTGGACGGCACCTGGGT 2500 
SQTPLVPWLSTVDGTWV 
CGACAGCCCGCTGGACGGGGAGTACTGGTACCGGAACCTGCGTGAACCGG 2550 
DSPLDGEYWYRNLREP 
TCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCCCAGGGCGACACCGTG 2600 
VGFHPAVSQLQAQGDTV 
TTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCAGGCGATGGACGACGA 2650 
FVEVSASPVLLQAMDDD 
TGTCGTCACGGTTGCCACGCTGCGTCGTGACGACGGCGACGCCACCCGGA 2700 
VVTVATLRRDDGDATR 
TGCTCACCGCCCTGGCACAGGCCTATGTCCACGGCGTCACCGTCGACTGG 2750 
MLTALAQAYVHGVTVDW 
CCCGCCATCCTCGGCACCACCACAACCCGGGTACTGGACCTTCCGACCTA 2800 
PAILGTTTTRVLDLPTY 
CGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGGCACGCCCGGCCGCAT 2850 
AFQHQRYWLESARPAA 
CCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCG 2900 
SDAGHPVLGSGIALAGS 
CCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGT 2950 
PGRVFTGSVPTGADRAV 
GTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCA 3000 
FVAELALAAADAVDCA 
CGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGC 3050 
TVERLDIASVPGRPGHG 
CGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCG 3100 
RTTVQTWVDEPADDGRR 
CCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACG 3150 
RFTVHTRTGDAPWTLH 
CCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGAC 3200 
AEGVLRPHGTALPDAAD 
GCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGT 3250 
AEWPPPGAVPADGLPGV 
GTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGG 3300 
WRRGDQVFAEAEVDGP 
ACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTC 3350 
DGFVVHPDLLDAVFSAV 
GGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGC 3400 
GDGSRQPAGWRDLTVHA 
GTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAG 3450 
SDATVLRACLTRRTDG 
CCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCG 3500 
AMGFAAFDGAGLPVLTA 
GAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTC 3550 
EAVTLREVASPSGSEES 
GGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACG 3600 
DGLHRLEWLAVAEAVY 
ACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGAC 3650 
DGDLPEGHVLITAAHPD 
GACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCT 3700 
DPEDIPTRAHTRATRVL 
GACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCC 3750 
TALQHHLTTTDHTLIV 
ACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACC 3800 
HTTTDPAGATVTGLTRT 
GCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCC 3850 
AQNEHPHRIRLIETDHP 
CCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACC 3900 
HTPLPLAQLATLDHPH 
TCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCAC 3950 
LRLTHHTLHHPHLTPLH 
ACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCAT 4000 
TTTPPTTTPLNPEHAII 
CATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGA 4050 
ITGGSGTLAGI LARHL 
5 ACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACC 4100 
NHPHTYLLSRTPPPDAT 
CCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCAC 4150 

PGTHLPCDVGDPHQLAT 
CACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCG 4200 
10 TLTHIPQPLTAIFHTA 

CCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACC 4250 
ATLDDGILHALT PDRLT 
ACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCAC 4300 
TVLHPKANAAWHLHHLT 
CCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCG 4350 
QNQPLTHFVLYSSAAA 
TCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTC 4400 
VLGS PGQGNYAAANAFL 
GACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCAT 4450 
20 DALATHRHTLGQPATS I 

CGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACG 4500 

AWGMWHTTSTLTGQLD 
ACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGAC 4550 
DADRDRIRRGGFLPITD 
25 GACGAGGGCATGGGGATGCAT 
D E G 

The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS 
with the endogenous AT domain replaced by the AT domain of module 12 (specific for 
malonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino acid 
sequence shown below. 

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 

QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
35 AAVLGHVGGEDI PATAA 

GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGIDSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
40 TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 

TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
45 DEPLAIVGMACRLPGGV 

GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 

ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
TEFPTDRGWDVDAIYD 
50 CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

TGATGFDAAFFGI S PRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
5 ALAMDPQQRVLLETSW 

AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 
TGVFVGAFS YGYGTGAD 
1 0 CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
15 ACSSSLVALHQAGQSLR 

CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
20 GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTS FAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
25 GHTVLAVVRGSAVNQDG 

GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLT PADVDA 
30 TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPI EAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
35 SLKSNI GHAQAASGVA 

GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400 
GIIKMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 
LHADEPS PHVDWTAGAV 
40 CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETDRPR 
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVSS FGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 
45 LEAGPVTET PAAS P SGD 

CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 

LPLLVSARS PEALDEQ 
TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
IRRLRAYLDTTPDVDRV 
50 GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQTLARRT HFAH RAV 
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 

LLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG 
GAGCAGCTAGCCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGT 1900 

EQLAAAFPVFARI HQQV 
GTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACG 1950 
5 WDLLDVPDLEVNETGY 

CCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAA 2000 
AQPALFAMQVALFGLLE 
TCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCT 2050 
SWGVRPDAVIGHSVGEL 
1 0 TGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTT 2100 
AAAYVS GVWSLEDACT 
TGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTG 2150 
LVSARARLMQALPAGGV 
ATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGA 2200 
15 MVAVPVS E DEARAVLGE 

GGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCG 2250 
GVEIAAVNGPSSVVLS 
GTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACG 2300 
GDEAAVLQAAEGLGKWT 
CGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCT 2350 
RLATSHAFHSARMEPML 
GGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGG 2400 
EEFRAVAEGLTYRTPQ 
TCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGG 2450 
VSMAVGDQVTTAEYWVR 
CAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGA 2500 
QVRDTVRFGEQVASYED 
CGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCG 2550 
AVFVELGADRSLARLV 
ACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGC 2600 
DGVAMLHGDHEIQAAIG 
GCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCT 2650 
ALAHLYVNGVTVDWPAL 
CCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCT 2700 
LGDAPATRVLDLPTYA 

TCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCGGCCGCATCCGAC 2750 
FQHQRYWLESARPAASD 
GCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCGCCGGG 2800 
AGHPVLGSGIALAGSPG 
40 CCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGTGTTCG 2850 
RVFTG SVPTGADRAVF 
TCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCACGGTC 2900 
VAE LALAAADAVDCATV 
GAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGCCGGAC 2950 
45 ERLDIASVPGRPGHGRT 

GACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCGCCGGT 3000 

TVQTWVDEPADDGRRR 
TCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACGCCGAG 3050 
FTVHTRTGDAPWTLHAE 
50 GGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGACGCCGA 3100 
GVLRPHGTALPDAADAE 
GTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGTGTGGC 3150 

WPPPGAVPADGLPGVW 
GCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGGACGGT 3200 
RRGDQVFAEAEVDGPDG 
TTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGA 32 50 

FVVHPDLLDAVFSAVGD 
CGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGCGTCGG 3300 
5 GSRQPAGWRDLTVHAS 

ACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAGCCATG 3350 
DATVLRACLTRRTDGAM 
GGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCGGAGGC 34 00 
GFAAFDGAGLPVLTAEA 
1 0 GGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTCGGACG 3450 
VTLREVASPSGSEESD 
GCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACGACGGT 3500 
GLHRLEWLAVAEAVYDG 
GACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGACGACCC 3550 
15 DLPEGHVLITAAHPDDP 

CGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCTGACCG 3600 

E DI PTRAHTRATRVLT 
CCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCCACACC 3650 
ALQHHLTTTDHTLIVHT 
ACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACCGCCCA 3700 
TTDPAGATVTGLTRTAQ 
GAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCCCCACA 3750 
NEHPHRIRLIETDHPH 
CCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACCTCCGC 3800 
TPLPLAQLATLDHPHLR 
CTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCACACCAC 3850 
LTHHTLHHPHLTPLHTT 
CACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCATCATCA 3900 
TPPTTTPLNPEHAIII 
CCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGAACCAC 3950 
TGGSGTLAGILARHLNH 
CCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACCCCCGG 4000 
PHTYLLSRTPPPDATPG 
CACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCACCACCC 4050 
THLPCDVGDPHQLATT 

TCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCGCCACC 4100 
LTHI PQPLTAIFHTAAT 
CTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACCACCGT 4150 
LDDGILHALTPDRLTTV 
40 CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCACCCAAA 4200 
LHPKANAAWHLHHLTQ 
ACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCGTCCTC 4 250 
NQPLTHFVLYSSAAAVL 
GGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTCGACGC 4 300 
45 GSPGQGNYAAANAFLDA 

CCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCATCGCCT 4 350 

LATHRHTLGQPATSIA 
GGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACGACGCC 4 4 00 
WGMWHTTSTLTGQLDDA 
50 GACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGACGACGA 4450 
DRDRIRRGGFLPITDDE 
GGGCATGGGGATGCAT 
G 
The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS 
with the endogenous AT domain replaced by the AT domain of module 13 (specific for 
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the 
amino acid sequence shown below. 

5 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 
QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
10 FKDLGI DSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPT PHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 
ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
TDGFGATGSQTSVLSG 
35 GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 

ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
40 SGECSLALVGGVTVMA 

CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTSFAE 
45 GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVL IVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
50 ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLT PADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
5 GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATYGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

SLKSNIGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400 
10 GIIKMVQALRHGELPPT 

CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 

LHADE PS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETDRPR 
15 GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVS S FGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 

LEAGPVTETPAASPSGD 
CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 
20 LPLLVSARSPEALDEQ 

TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
IRRLRAYLDTTPDVDRV 
GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQT LARRTH FAHRAV 
25 GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 
LLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG 
GAGCAGCTAGCCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTG 1900 
30 EQLADS SVVFAERMAEC 

TGCGGCGGCGTTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGG 1950 

AAALRE FVDWDLFTVL 
ATGATCCGGCGGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGG 2000 
DDPAVVDRVDVVQPASW 
35 GCGATGATGGTTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCC 2050 
AMMVS LAAVWQAAGVRP 
GGATGCGGTGATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGG 2100 

DAVIGHSQGEIAAACV 
CGGGTGCGGTGTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGC 2150 
40 AGAVSLRDAARIVTLRS 

CAGGCGATCGCCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGC 2200 

QAIARGLAGRGAMASVA 
CCTGCCCGCGCAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCC 2250 
LPAQDVELVDGAWIAA 
45 ACAACGGGCCCGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGAC 2300 
HNGPASTVIAGT PEAVD 
CATGTCCTCACCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCAC 2350 

HVLTAHEAQGVRVRRIT 
CGTCGACTATGCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAAC 24 00 
50 VDYASHTPHVELIRDE 

TACTCGACATCACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGG 2450 
LLDITSDSSSQTPLVPW 
CTGTCGACCGTGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTA 2500 
LSTVDGTWVDS PLDGEY 
CTGGTACCGGAACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCC 2550 

WYRNLREPVGFHPAVS 
AGTTGCAGGCCCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCG 2600 
QLQAQGDTVFVEVSAS P 
5 GTGTTGTTGCAGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCG 2650 
VLLQAMDDDVVTVATLR 
TCGTGACGACGGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCT 2700 

RDDGDATRMLTALAQA 
ATGTCCACGGCGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACA 2750 
10 YVHGVTVDWPAILGTTT 

ACCCGGGTACTGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTG 28 00 

TRVLDLPTYAFQHQRYW 
GCTCGAGTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGG 2850 
LESARPAASDAGHPVL 
15 GCTCCGGTATCGCCCTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTTCC 2900 
GSGIALAGSPGRVFTGS 
GTGCCGACCGGTGCGGACCGCGCGGTGTTCGTCGCCGAGCTGGCGCTGGC 2950 
VPTGADRAVFVAELALA 
CGCCGCGGACGCGGTCGACTGCGCCACGGTCGAGCGGCTCGACATCGCCT 3000 
AADAVDCATVERLDIA 
CCGTGCCCGGCCGGCCGGGCCATGGCCGGACGACCGTACAGACCTGGGTC 3050 
SVPGRPGHGRTTVQTWV 
GACGAGCCGGCGGACGACGGCCGGCGCCGGTTCACCGTGCACACCCGCAC 3100 
DEPADDGRRRFTVHTRT 
CGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTGCTGCGCCCCCATG 3150 
GDAPWTLHAEGVLRPH 
GCACGGCCCTGCCCGATGCGGCCGACGCCGAGTGGCCCCCACCGGGCGCG 3200 
GTALPDAADAEWPPPGA 
GTGCCCGCGGACGGGCTGCCGGGTGTGTGGCGCCGGGGGGACCAGGTCTT 3250 
VPADGLPGVWRRGDQVF 
CGCCGAGGCCGAGGTGGACGGACCGGACGGTTTCGTGGTGCACCCCGACC 3300 
AEAEVDGPDGFVVHPD 
TGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGAAGCCGCCAGCCGGCC 3350 
LLDAVFSAVGDGSRQPA 

35 GGATGGCGCGACCTGACGGTGCACGCGTCGGACGCCACCGTACTGCGCGC 3400 
GWRDLTVHAS DATVLRA 
CTGCCTCACCCGGCGCACCGACGGAGCCATGGGATTCGCCGCCTTCGACG 34 50 

CLTRRTDGAMGFAAFD 
GCGCCGGCCTGCCGGTACTCACCGCGGAGGCGGTGACGCTGCGGGAGGTG 3500 
40 GAGLPVLTAEAVTLREV 

GCGTCACCGTCCGGCTCCGAGGAGTCGGACGGCCTGCACCGGTTGGAGTG 3550 

ASPSGSEESDGLHRLEW 
GCTCGCGGTCGCCGAGGCGGTCTACGACGGTGACCTGCCCGAGGGACATG 3600 
LAVAEAVYDGDLPEGH 
45 TCCTGATCACCGCCGCCCACCCCGACGACCCCGAGGACATACCCACCCGC 3650 
VLITAAHPDDPEDIPTR 
GCCCACACCCGCGCCACCCGCGTCCTGACCGCCCTGCAACACCACCTCAC 37 00 

AHTRATRVLTALQHHLT 
CACCACCGACCACACCCTCATCGTCCACACCACCACCGACCCCGCCGGCG 3750 
50 TTDHTLIVHTTTDPAG 

CCACCGTCACCGGCCTCACCCGCACCGCCCAGAACGAACACCCCCACCGC 3800 
ATVTGLTRTAQNEHPHR 
ATCCGCCTCATCGAAACCGACCACCCCCACACCCCCCTCCCCCTGGCCCA 3850 
IRLIETDHPHTPLPLAQ 
ACTCGCCACCCTCGACCACCCCCACCTCCGCCTCACCCACCACACCCTCC 3900 

LATLDHPHLRLTHHTL 
ACCACCCCCACCTCACCCCCCTCCACACCACCACCCCACCCACCACCACC 3950 
HHPHLTPLHTTTPPTTT 
CCCCTCAACCCCGAACACGCCATCATCATCACCGGCGGCTCCGGCACCCT 4000 

PLNPEHAIIITGGSGTL 
CGCCGGCATCCTCGCCCGCCACCTGAACCACCCCCACACCTACCTCCTCT 4050 

AGILARHLNHPHTYLL 
CCCGCACCCCACCCCCCGACGCCACCCCCGGCACCCACCTCCCCTGCGAC 4100 
SRTPPPDATPGTHLPCD 
GTCGGCGACCCCCACCAACTCGCCACCACCCTCACCCACATCCCCCAACC 4150 

VGDPHQLATTLTHI PQP 
CCTCACCGCCATCTTCCACACCGCCGCCACCCTCGACGACGGCATCCTCC 4200 

LTAI FHTAATLDDGIL 
ACGCCCTCACCCCCGACCGCCTCACCACCGTCCTCCACCCCAAAGCCAAC 4 250 
HALTPDRLTTVLHPKAN 
GCCGCCTGGCACCTGCACCACCTCACCCAAAACCAACCCCTCACCCACTT 4300 

AAWHLHHLTQNQPLTHF 
CGTCCTCTACTCCAGCGCCGCCGCCGTCCTCGGCAGCCCCGGACAAGGAA 4350 

VLYSSAAAVLGS PGQG 
ACTACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCAC 44 00 
NYAAANAFLDALAT HRH 
ACCCTCGGCCAACCCGCCACCTCCATCGCCTGGGGCATGTGGCACACCAC 44 50 

TLGQPATSIAWGMWHTT 
CAGCACCCTCACCGGACAACTCGACGACGCCGACCGGGACCGCATCCGCC 4500 

STLTGQLDDADRDRIR 
GCGGCGGTTTCCTCCCGATCACGGACGACGAGGGCATGGGGATGCAT 
RGGFLPITDDEG 
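
The paired DNA and amino acid lines in the listings above can be cross-checked by translating the DNA in the appropriate reading frame. The following is a minimal sketch in Python, assuming the standard genetic code; the names `translate`, `CODON_TABLE`, and the `offset` parameter are illustrative and are not part of the original disclosure. For the first 50 nucleotides of the hybrid FK-520 module 8 cassettes, a frame offset of 2 reproduces the listed peptide, preceded by two residues (I, W) that the listing does not print.

```python
from itertools import product

# Standard genetic code, enumerated in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate a DNA string starting at `offset`; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()  # drop whitespace left over from the listing
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First sequence line of the hybrid module 8 cassettes (positions 1-50).
first_line = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
peptide = translate(first_line, offset=2)  # "IWQLAEALLTLVREST"
```

Running the same function over a whole cassette, with the position numbers and spaces removed, gives a quick way to spot lines where the listing and the translation disagree.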

Phage KC515 DNA was prepared using the procedure described in Genetic 
Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et al. A 
phage suspension prepared from 10 plates (100 mm) of confluent plaques of KC515 on S. 
lividans TK24 generally gave about 3 µg of phage DNA. The DNA was ligated to 
circularize at the cos site, subsequently digested with restriction enzymes BamHI and 
PstI, and dephosphorylated with shrimp alkaline phosphatase (SAP). 

Each module 8 cassette described above was excised with restriction enzymes 
BglII and NsiI and ligated into the compatible BamHI and PstI sites of KC515 phage 
DNA prepared as described above. The ligation mixture containing KC515 and various 
cassettes was transfected into protoplasts of Streptomyces lividans TK24 using the 
procedure described in Genetic Manipulation of Streptomyces, A Laboratory Manual, 
edited by D. Hopwood et al., and overlaid with TK24 spores. After 16-24 hr, the plaques 
were restreaked on plates overlaid with TK24 spores. Single plaques were picked and 
resuspended in 200 µl of nutrient broth. Phage DNA was prepared by the boiling method 
(Hopwood et al., supra). PCR with primers spanning the left and right boundaries of 
the recombinant phage was used to verify that the correct phage had been isolated. In most 
cases, at least 80% of the plaques contained the expected insert. To confirm the presence 
of the resistance marker (thiostrepton), a spot test is used, as described in Lomovskaya et 
al. (1997), in which a plate with spots of phage is overlaid with a mixture of spores of 
TK24 and a phiC31 TK24 lysogen. After overnight incubation, the plate is overlaid with 
antibiotic in soft agar. A working stock is made of all phage containing desired 
constructs. 
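
The cloning step above depends on BglII and NsiI leaving cohesive ends that match those left by BamHI and PstI, respectively. A minimal sketch of that compatibility check follows, using the standard recognition sites and cut positions of the four enzymes. The dictionary layout and the function names `overhang` and `compatible` are illustrative assumptions, not part of the original disclosure. The two short strings are the first and last nucleotides of the module 8 cassettes as printed in the listings.

```python
# Recognition site (5'->3', top strand) and top-strand cut position.
ENZYMES = {
    "BglII": ("AGATCT", 1),  # A^GATCT
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "NsiI": ("ATGCAT", 5),   # ATGCA^T
    "PstI": ("CTGCAG", 5),   # CTGCA^G
}


def overhang(enzyme):
    """Return (single-stranded overhang, polarity) for a palindromic site."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut  # the bottom-strand cut mirrors the top-strand cut
    lo, hi = sorted((cut, mirror))
    return site[lo:hi], ("5'" if cut < mirror else "3'")


def compatible(a, b):
    """Two enzymes give ligatable ends if overhang sequence and polarity match."""
    return overhang(a) == overhang(b)


# Termini of the module 8 cassettes as they appear in the listings.
cassette_start = "AGATCTGGCAGCTCGCC"
cassette_end = "GGGCATGGGGATGCAT"
has_bglii_site = cassette_start.startswith(ENZYMES["BglII"][0])
has_nsii_site = cassette_end.endswith(ENZYMES["NsiI"][0])
```

Here `compatible("BglII", "BamHI")` and `compatible("NsiI", "PstI")` both hold, because each pair shares the same four-base overhang (GATC as a 5' overhang, TGCA as a 3' overhang).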

Streptomyces hygroscopicus ATCC 14891 (see U.S. Patent No. 3,244,592, issued 
5 Apr 1966, incorporated herein by reference) mycelia were infected with the 
recombinant phage by mixing the spores and phage (1 x 10^8 of each), and incubating on 
R2YE agar (Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D. 
Hopwood et al.) at 30°C for 10 days. Recombinant clones were selected and plated on 
minimal medium containing thiostrepton (50 µg/ml) to select for the thiostrepton 
resistance-conferring gene. Primary thiostrepton resistant clones were isolated and 
purified through a second round of single colony isolation, as necessary. To obtain 
thiostrepton-sensitive revertants that underwent a second recombination event to evict the 
phage genome, primary recombinants were propagated in liquid media for two to three 
days in the absence of thiostrepton and then spread on agar medium without thiostrepton 
to obtain spores. Spores were plated to obtain about 50 colonies per plate, and 
thiostrepton sensitive colonies were identified by replica plating onto thiostrepton 
containing agar medium. PCR was used to determine which of the thiostrepton 
sensitive colonies reverted to the wild type (reversal of the initial integration event), and 
which contain the desired AT swap at module 8 in the ATCC 14891-derived cells. The 
PCR primers used amplified either the KS/AT junction or the AT/DH junction of the 
wild-type and the desired recombinant strains. Fermentation of the recombinant strains, 
followed by isolation of the metabolites and analysis by LCMS and NMR, is used to 
characterize the novel polyketide compounds. 



Example 2 

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-506 
The present invention also provides the 13-desmethoxy derivatives of FK-506 and 
the novel PKS enzymes that produce them. A variety of Streptomyces strains that produce 
FK-506 are known in the art, including S. tsukubaensis No. 9993 (FERM BP-927), 
described in U.S. Patent No. 5,624,852, incorporated herein by reference; S. 
hygroscopicus subsp. yakushimaensis No. 7238, described in U.S. Patent No. 4,894,366, 
incorporated herein by reference; S. sp. MA6858 (ATCC 55098), described in U.S. 
Patent No. 5,116,756, incorporated herein by reference; and S. sp. MA6548, described 
in Motamedi et al., 1998, "The biosynthetic gene cluster for the macrolactone ring of the 
immunosuppressant FK-506," Eur. J. Biochem. 256: 528-534, and Motamedi et al., 1997, 
"Structural organization of a multifunctional polyketide synthase involved in the 
biosynthesis of the macrolide immunosuppressant FK-506," Eur. J. Biochem. 244: 74-80, 
each of which is incorporated herein by reference. 

The complete sequence of the FK-506 gene cluster from Streptomyces sp. 
MA6548 is known, and the sequences of the corresponding gene clusters from other 
FK-506-producing organisms are highly homologous thereto. The novel FK-506 recombinant 
gene clusters of the present invention differ from the naturally occurring gene clusters in 
that the AT domain of module 8 of the naturally occurring PKSs is replaced by an AT 
domain specific for malonyl CoA or methylmalonyl CoA. These AT domain 
replacements are made at the DNA level, following the methodology described in 
Example 1. 
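
At the sequence level, an AT domain replacement of this kind amounts to exchanging the fragment bounded by two restriction sites in the host module for the corresponding fragment from a donor. The sketch below illustrates only that string operation. The function name `swap_fragment` and the short toy sequences are hypothetical and chosen for readability; the recognition sites used (AvrII, CCTAGG; XhoI, CTCGAG) are the standard ones, and the sketch assumes each site occurs in the expected order in both sequences.

```python
AVRII = "CCTAGG"  # AvrII recognition site
XHOI = "CTCGAG"   # XhoI recognition site


def swap_fragment(host, donor, left_site, right_site):
    """Replace the left_site..right_site fragment of `host` with that of `donor`.

    Both sites are kept in the result. Raises ValueError if a site is missing.
    """
    def bounds(seq):
        start = seq.index(left_site)
        end = seq.index(right_site, start + len(left_site)) + len(right_site)
        return start, end

    host_start, host_end = bounds(host)
    donor_start, donor_end = bounds(donor)
    return host[:host_start] + donor[donor_start:donor_end] + host[host_end:]


# Hypothetical toy sequences: the "AT domain" is the stretch between the sites.
toy_host = "AAAA" + AVRII + "TTTT" + XHOI + "AAAA"
toy_donor = "GG" + AVRII + "CACA" + XHOI + "GG"
toy_hybrid = swap_fragment(toy_host, toy_donor, AVRII, XHOI)
```

In the toy case the flanking host sequence is preserved and only the stretch between the two sites changes, which mirrors how the hybrid module 8 listings share their flanking regions with the naturally occurring module.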

The naturally occurring module 8 sequence for the MA6548 strain is shown 
below, followed by the illustrative hybrid module 8 sequences for the MA6548 strains. 

25 GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 
MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
30 RTTVRRAAVRERSLAD 

GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNS TATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
5 ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 

DELAGT RAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
10 TAAAHDE PLAIVGMACR 

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAITEFPADRGWDV 
15 ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 

DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 
ISPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 

35 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
40 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANS DGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLT P 
45 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400 

ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 

PI EAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
50 PLLLGSLKSNIGHAQA 

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAIRHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 17 00 
TGRPRRAAVS SFGVSGT 
5 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 17 50 

NAHI I LEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAI EAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 
10 GPLPAAPPSAPGEDLPL 

CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARS PEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
RAYLDTGPGVDRAAVA 
1 5 AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 

QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAACTCG 2100 
20 V Y S GQGTQHPAMGEQL 

CGGCCGCGTTCCCCGTGTTCGCCGATGCCTGGCACGACGCGCTCCGACGG 2150 
AAAFPVFADAWHDALRR 
CTCGACGACCCCGACCCGCACGACCCCACACGGAGCCAGCACACGCTCTT 2200 
LDDPDPHDPTRSQHTLF 
25 CGCCCACCAGGCGGCGTTCACCGCCCTCCTGAGGTCCTGGGACATCACGC 2250 

AHQAAFTALLRSWDIT 
CGCACGCCGTCATCGGCCACTCGCTCGGCGAGATCACCGCCGCGTACGCC 2300 
PHAVIGHSLGEITAAYA 
GCCGGGATCCTGTCGCTCGACGACGCCTGCACCCTGATCACCACGCGTGC 2350 
30 AGILSLDDACTLITTRA 

CCGCCTCATGCACACGCTTCCGCCGCCCGGCGCCATGGTCACCGTGCTGA 24 00 

RLMHTLPPPGAMVTVL 
CCAGCGAGGAGGAGGCCCGTCAGGCGCTGCGGCCGGGCGTGGAGATCGCC 24 50 
TSEEEARQALRPGVEIA 
35 GCGGTCTTCGGCCCGCACTCCGTCGTGCTCTCGGGCGACGAGGACGCCGT 2500 

AVFGPHSVVLSGDEDAV 
GCTCGACGTCGCACAGCGGCTCGGCATCCACCACCGTCTGCCCGCGCCGC 2550 

LDVAQRLGI HHRLPAP 
ACGCGGGCCACTCCGCGCACATGGAACCCGTGGCCGCCGAGCTGCTCGCC 2600 
40 HAGHSAHME PVAAELLA 

ACCACTCGCGAGCTCCGTTACGACCGGCCCCACACCGCCATCCCGAACGA 2650 

TTRELRYDRPHTAI PND 
CCCCACCACCGCCGAGTACTGGGCCGAGCAGGTCCGCAACCCCGTGCTGT 2700 
PTTAEYWAEQVRNPVL 
45 TCCACGCCCACACCCAGCGGTACCCCGACGCCGTGTTCGTCGAGATCGGC 2750 

FHAHTQRYPDAVFVEIG 
CCCGGCCAGGACCTCTCACCGCTGGTCGACGGCATCGCCCTGCAGAACGG 2800 

PGQDLS PLVDGIALQNG 
CACGGCGGACGAGGTGCACGCGCTGCACACCGCGCTCGCCCGCCTCTTCA 2850 
50 TADEVHALHTALARLF 

CACGCGGCGCCACGCTCGACTGGTCCCGCATCCTCGGCGGTGCTTCGCGG 2 900 
TRGATLDWSRILGGASR 
CACGACCCTGACGTCCCCTCGTACGCGTTCCAGCGGCGTCCCTACTGGAT 2950 
HDPDVPSYAFQRRPYWI 
CGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCA 3000 

ESAPPATADSGHPVLG 
CCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTG 3050 
TGVAVAGS PGRVFTGPV 
5 CCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGC 3100 

PAGADRAVFIAELALAA 
CGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCG 3150 

ADATDCATVEQLDVTS 
TGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGAT 3200 
10 VPGGSARGRATAQTWVD 

GAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGG 3250 

EPAADGRRRFTVHTRVG 
CGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCG 3300 
DAPWTLHAEGVLRPGR 
1 5 TGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTG 3350 

VPQPEAVDTAWPPPGAV 
CCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGT 3400 
PADGLPGAWRRADQVFV 
CGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGC 3450 
EAEVDSPDGFVAHPDL 
TCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGA 3500 
LDAVFSAVGDGSRQPTG 
TGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTG 3550 
WRDLAVHASDATVLRAC 
CCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTG 3600 
LTRRDSGVVELAAFDG 
CCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCG 3650 
AGMPVLTAESVTLGEVA 
TCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTT 3700 
SAGGSDESDGLLRLEWL 
GCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCT 3750 
PVAEAHYDGADELPEG 
ACACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAAC 3800 
YTLITATHPDDPDDPTN 
CCCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCAC 3850 

PHNTPTRTHTQTTRVLT 
CGCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACA 3900 

ALQHHLITTNHTLIVH 
CCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCA 3950 
40 TTTDPPGAAVTGLTRTA 

CAAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCA 4 000 

QNEHPGRIHLIETHHPH 
CACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTAC 4050 
TPLPLTQLTTLHQPHL 
45 GCCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACC 4100 

RLTNNTLHTPHLT PITT 
CACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAA 4150 

HHNTTTTTPNTPPLNPN 
CCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCG 4200 
50 HAILITGGSGTLAGIL 

CCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCA 4250 
ARHLNHPHTYLLSRTPP 
CCCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCAC 4300 
PPTTPGTHIPCDLTDPT 
CCAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCT 4350 

QITQALTHIPQPLTGI 
TCCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCC 4 400 
FHTAATLDDATLTNLTP 
CAACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCT 4 4 50 

QHLTTTLQPKADAAWHL 
CCACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCA 4 500 

HHHTQNQPLTHFVLYS 
GCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCC 4 550 
SAAATLGS PGQANYAAA 
AACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACC 4 600 

NAFLDALATHRHTQGQP 
CGCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCA 4 650 

ATTIAWGMWHTTTTLT 
GCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTG 47 00 
SQLTDSDRDRIRRGGFL 
CCGATCTCGGACGACGAGGGCATGC 

PI SDDEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of 
module 12 of rapamycin is shown below. 

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RT TVRRAAVRERS LAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGIDSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 

GTDAITEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 

HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

I S P REALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
5 GTGA DTNGFGATGSQT 

GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTAC S S S LVAL HQA 
10 AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
15 GLAPDGRAKAFGAGADG 

TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 
ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 
PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGIIKMVQAIRHG 

GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 

ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
35 TAGAVELLTSARPWPG 

CCGGTCGCCCTAGGCGGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACC 17 00 
TGRPRRAGVSS FGI SGT 
AACGCCCACGTCATCCTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAA 1750 
NAHVILESAPPTQPADN 
40 CGCGGTGATCGAGCGGGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCA 1800 
AVIERAPEWVPLVISA 
GGACCCAGTCGGCTTTGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTG 1850 
RTQSALTEHEGRLRAYL 
GCGGCGTCGCCCGGGGTGGATATGCGGGCTGTGGCATCGACGCTGGCGAT 1900 
AASPGVDMRAVASTLAM

GACACGGTCGGTGTTCGAGCACCGTGCCGTGCTGCTGGGAGATGACACCG 1950 

TRSVFEHRAVLLGDDT 
TCACCGGCACCGCTGTGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGA 2000 
VTGTAVSDPRAVFVFPG
CAGGGGTCGCAGCGTGCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCC 2050
QGSQRAGMGEELAAAFP 
CGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCG 2100 

VFARIHQQVWDLLDVP 
ATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATG 2150
DLEVNETGYAQPALFAM
CAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTGTACGACCGGACGC 2200 

QVALFGLLESWGVRPDA 
GGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGG 2250 
VIGHSVGELAAAYVSG

TGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTG 2300 
VWSLEDACTLVSARARL 
ATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGA 2350 
MQALPAGGVMVAVPVSE 
GGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCA 2400
DEARAVLGEGVEIAAV
ACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAG 2450
NGPSSVVLSGDEAAVLQ 
GCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTT 2500 
AAEGLGKWTRLATSHAF

CCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCG 2550 

HSARMEPMLEEFRAVA
AAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAG 2600 
EGLTYRTPQVSMAVGDQ 
GTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTT 2650
VTTAEYWVRQVRDTVRF 
CGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTG 2700 

GEQVASYEDAVFVELG 
CCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGC 2750 
ADRSLARLVDGVAMLHG

GACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAA 2800

DHEIQAAIGALAHLYVN 
CGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACAC 2850 
GVTVDWPALLGDAPAT 
GGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCAGCGCTACTGGCTC 2900
RVLDLPTYAFQHQRYWL
GAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCAC 2950

ESAPPATADSGHPVLGT 
CGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGC 3000 
GVAVAGSPGRVFTGPV

CCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCC 3050 
PAGADRAVFIAELALAA 
GCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGT 3100 
ADATDCATVEQLDVTSV 
GCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATG 3150
PGGSARGRATAQTWVD 
AACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGC 3200 
EPAADGRRRFTVHTRVG 
GACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGT 3250 
DAPWTLHAEGVLRPGRV

GCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGC 3300 

PQPEAVDTAWPPPGAV 
CCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTC 3350 
PADGLPGAWRRADQVFV 
GAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCT 3400
EAEVDSPDGFVAHPDLL
CGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGAT 3450

DAVFSAVGDGSRQPTG 
GGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGC 3500
WRDLAVHASDATVLRAC
CTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGC 3550

LTRRDSGVVELAAFDGA
CGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGT 3600 

GMPVLTAESVTLGEVA 
CGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTG 3650 
SAGGSDESDGLLRLEWL 
CCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTA 3700 

PVAEAHYDGADELPEGY 
CACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACC 3750 

TLITATHPDDPDDPTN 
CCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACC 3800 
PHNTPTRTHTQTTRVLT 
GCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACAC 3850 

ALQHHLITTNHTLIVHT 
CACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC 3900 

TTDPPGAAVTGLTRTA 
AAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCAC 3950 
QNEHPGRIHLIETHHPH 
ACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACG 4000 

TPLPLTQLTTLHQPHLR 
CCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCC 4050 

LTNNTLHTPHLTPITT 
ACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAAC 4100 
HHNTTTTTPNTPPLNPN 
CACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGC 4150 

HAILITGGSGTLAGILA 
CCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCAC 4200

RHLNHPHTYLLSRTPP 
CCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACC 4250 
PPTTPGTHIPCDLTDPT 
CAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTT 4300 

QITQALTHIPQPLTGIF 
CCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCC 4350 

HTAATLDDATLTNLTP 
AACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTC 4400
QHLTTTLQPKADAAWHL
CACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAG 4450

HHHTQNQPLTHFVLYSS
CGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCA 4500

AAATLGSPGQANYAAA
ACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCC 4550
NAFLDALATHRHTQGQP
GCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAG 4600

ATTIAWGMWHTTTTLTS
CCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGC 4650

QLTDSDRDRIRRGGFL
CGATCTCGGACGACGAGGGCATGC
PISDDEGM

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50

MRLYEAARRTGSPVVV
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150
RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNSTATVLGHLGAEDI

CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGIDSLTA
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400
TAVFDFPTPRALAARLG
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDEPLAIVGMACR

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVASPQELWRLVAS
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600
GTDAITEFPADRGWDV
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650
DALYDPDPDAIGKTFVR
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700

HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750
ISPREALAMDPQQRVL

TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
ARGSDTGVFIGAFSYGY
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900
GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400
ADVDAVEAHGTGTRLGD
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450
PIEAQALLATYGQDRAT
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500
PLLLGSLKSNIGHAQA
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550
ASGVAGIIKMVQAIRHG
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600
ELPPTLHADEPSPHVDW

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC 1700 
TGRPRRAGVSSFGVSGT 
AACGCCCACGTCATCCTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGA 1750
NAHVILESAPPAQPAEE 
GGCGCAGCCTGTTGAGACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGG 1800 
AQPVETPVVASDVLPL 
TGATATCGGCCAAGACCCAGCCCGCCCTGACCGAACACGAAGACCGGCTG 1850

VISAKTQPALTEHEDRL
CGCGCCTACCTGGCGGCGTCGCCCGGGGCGGATATACGGGCTGTGGCATC 1900

RAYLAASPGADIRAVAS

GACGCTGGCGGTGACACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTG 1950

TLAVTRSVFEHRAVLL
GAGATGACACCGTCACCGGCACCGCGGTGACCGACCCCAGGATCGTGTTT 2000
GDDTVTGTAVTDPRIVF
GTCTTTCCCGGGCAGGGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCG 2050

VFPGQGWQWLGMGSALR
CGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGT 2100

DSSVVFAERMAECAAA

TGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGCG 2150

LREFVDWDLFTVLDDPA
GTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGT 2200

VVDRVDVVQPASWAMMV
TTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGA 2250

SLAAVWQAAGVRPDAV
TCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTG 2300 
IGHSQGEIAAACVAGAV
TCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGC 2350
SLRDAARIVTLRSQAIA

CCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGC 2400 

RGLAGRGAMASVALPA
AGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCCC 2450
QDVELVDGAWIAAHNGP
GCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCAC 2500
ASTVIAGTPEAVDHVLT
CGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTATG 2550 

AHEAQGVRVRRITVDY
CCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACATC 2600
ASHTPHVELIRDELLDI

ACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGT 2650 

TSDSSSQTPLVPWLSTV 
GGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGGA 2700
DGTWVDSPLDGEYWYR
ACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCC 2750
NLREPVGFHPAVSQLQA 
CAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCA 2800 
QGDTVFVEVSASPVLLQ 
GGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGACG 2850
AMDDDVVTVATLRRDD
GCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGGC 2900
GDATRMLTALAQAYVHG
GTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTACT 2950
VTVDWPAILGTTTTRVL

GGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGG 3000 

DLPTYAFQHQRYWLES 
CTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGTC 3050 
APPATADSGHPVLGTGV 
GCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCGG 3100
AVAGSPGRVFTGPVPAG
TGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGACG 3150 

ADRAVFIAELALAAAD 
CCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGGC 3200 
ATDCATVEQLDVTSVPG

GGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCGC 3250 
GSARGRATAQTWVDEPA 
CGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCCC 3300

ADGRRRFTVHTRVGDA
CGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCAG 3350
PWTLHAEGVLRPGRVPQ
CCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGGA 3400

PEAVDTAWPPPGAVPAD
CGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCCG 3450

GLPGAWRRADQVFVEA
AAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGCG 3500

EVDSPDGFVAHPDLLDA

GTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCGA 3550

VFSAVGDGSRQPTGWRD
CCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACCC 3600
LAVHASDATVLRACLT
GCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAATG 3650
RRDSGVVELAAFDGAGM
CCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAGG 3700
PVLTAESVTLGEVASAG

CGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTGG 3750

GSDESDGLLRLEWLPV 
CGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCTC 3800 
AEAHYDGADELPEGYTL 
ATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACAA 3850
ITATHPDDPDDPTNPHN
CACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTCC 3900

TPTRTHTQTTRVLTAL 
AACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCACC 3950 
QHHLITTNHTLIVHTTT

GACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACGA 4000 

DPPGAAVTGLTRTAQNE 
ACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCAC 4050
HPGRIHLIETHHPHTP
TCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCACC 4100
LPLTQLTTLHQPHLRLT 
AACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACAA 4150 
NNTLHTPHLTPITTHHN 
CACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCCA 4200
TTTTTPNTPPLNPNHA 
TCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCAC 4250 
ILITGGSGTLAGILARH 
CTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCAC 4300 
LNHPHTYLLSRTPPPPT

CACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATCA 4350 

TPGTHIPCDLTDPTQI 
CCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACACC 4400
TQALTHIPQPLTGIFHT
GCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACCT 4450
AATLDDATLTNLTPQHL
CACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCACC 4500
TTTLQPKADAAWHLHH
ACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCC 4550

HTQNQPLTHFVLYSSAA
GCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCTT 4600

ATLGSPGQANYAAANAF

CCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACCA 4650
LDALATHRHTQGQPAT

CCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACTC 4700
TIAWGMWHTTTTLTSQL
ACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCTC 4750

TDSDRDRIRRGGFLPIS
GGACGACGAGGGCATGC

DDEGM

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50
MRLYEAARRTGSPVVV

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150
RTTVRRAAVRERSLAD
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300
PATTTFKELGIDSLTA

TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400
TAVFDFPTPRALAARLG
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450
DELAGTRAPVAARTAA
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVASPQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAITEFPADRGWDV

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750
ISPREALAMDPQQRVL
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800
LETSWEAFESAGITPDA
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850
ARGSDTGVFIGAFSYGY

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000
VTVDTACSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR

GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
TSFAEGAGALVVERLS
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400
ADVDAVEAHGTGTRLGD
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450
PIEAQALLATYGQDRAT
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500
PLLLGSLKSNIGHAQA
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550
ASGVAGIIKMVQAIRHG
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600
ELPPTLHADEPSPHVDW

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700
TGRPRRAAVSSFGVSGT
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750
NAHIILEAGPVKTGPVE
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAIEAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850
GPLPAAPPSAPGEDLPL
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
RAYLDTGPGVDRAAVA

AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 
DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100
VYSGQGTQHPAMGEQL 
CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 2150 
AAAFPVFARIHQQVWDL 
CTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGC 2200 
LDVPDLEVNETGYAQPA

CCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTG 2250 

LFAMQVALFGLLESWG 
TACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCG 2300 
VRPDAVIGHSVGELAAA 
TATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGC 2350
YVSGVWSLEDACTLVSA 
GCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTG 2400 

RARLMQALPAGGVMVA
TCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAG 2450
VPVSEDEARAVLGEGVE

ATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGC 2500 

IAAVNGPSSVVLSGDEA 
CGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGA 2550 
AVLQAAEGLGKWTRLA 
CCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTC 2600
TSHAFHSARMEPMLEEF
CGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGC 2650

RAVAEGLTYRTPQVSMA
CGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGG 2700
VGDQVTTAEYWVRQVR

ACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTC 2750 
DTVRFGEQVASYEDAVF 
GTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGC 2800
VELGADRSLARLVDGVA
GATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCC 2850
MLHGDHEIQAAIGALA
ACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGAT 2900
HLYVNGVTVDWPALLGD
GCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCA 2950
APATRVLDLPTYAFQHQ

GCGCTACTGGCTCGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACC 3000 

RYWLESAPPATADSGH 
CCGTCCTCGGCACCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTC 3050 
PVLGTGVAVAGSPGRVF 
ACGGGTCCCGTGCCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACT 3100
TGPVPAGADRAVFIAEL 
GGCGCTCGCCGCCGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCG 3150 

ALAAADATDCATVEQL
ACGTCACCTCCGTGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAG 3200
DVTSVPGGSARGRATAQ
ACCTGGGTCGATGAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCA 3250 

TWVDEPAADGRRRFTVH 
CACCCGCGTCGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCC 3300 
TRVGDAPWTLHAEGVL

GCCCCGGCCGCGTGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCG 3350 
RPGRVPQPEAVDTAWPP 
CCGGGCGCGGTGCCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGA 3400 
PGAVPADGLPGAWRRAD 
CCAGGTCTTCGTCGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCAC 3450
QVFVEAEVDSPDGFVA
ACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGC 3500
HPDLLDAVFSAVGDGSR
CAGCCGACCGGATGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGT 3550
QPTGWRDLAVHASDATV

GCTGCGCGCCTGCCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCG 3600 

LRACLTRRDSGVVELA 
CCTTCGACGGTGCCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTG 3650 
AFDGAGMPVLTAESVTL

GGCGAGGTCGCGTCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCG 3700
GEVASAGGSDESDGLLR
GCTTGAGTGGTTGCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGC 3750

LEWLPVAEAHYDGADE
TGCCCGAGGGCTACACCCTCATCACCGCCACACACCCCGACGACCCCGAC 3800

LPEGYTLITATHPDDPD
GACCCCACCAACCCCCACAACACACCCACACGCACCCACACACAAACCAC 3850

DPTNPHNTPTRTHTQTT
ACGCGTCCTCACCGCCCTCCAACACCACCTCATCACCACCAACCACACCC 3900

RVLTALQHHLITTNHT
TCATCGTCCACACCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTC 3950
LIVHTTTDPPGAAVTGL

ACCCGCACCGCACAAAACGAACACCCCGGCCGCATCCACCTCATCGAAAC 4000
TRTAQNEHPGRIHLIET
CCACCACCCCCACACCCCACTCCCCCTCACCCAACTCACCACCCTCCACC 4050

HHPHTPLPLTQLTTLH

AACCCCACCTACGCCTCACCAACAACACCCTCCACACCCCCCACCTCACC 4100 
QPHLRLTNNTLHTPHLT 
CCCATCACCACCCACCACAACACCACCACAACCACCCCCAACACCCCACC 4150 
PITTHHNTTTTTPNTPP 
CCTCAACCCCAACCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCG 4200
LNPNHAILITGGSGTL
CCGGCATCCTCGCCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCC 4250
AGILARHLNHPHTYLLS
CGCACACCACCACCCCCCACCACACCCGGCACCCACATCCCCTGCGACCT 4300
RTPPPPTTPGTHIPCDL

CACCGACCCCACCCAAATCACCCAAGCCCTCACCCACATACCACAACCCC 4350 

TDPTQITQALTHIPQP
TCACCGGCATCTTCCACACCGCCGCCACCCTCGACGACGCCACCCTCACC 4400
LTGIFHTAATLDDATLT
AACCTCACCCCCCAACACCTCACCACCACCCTCCAACCCAAAGCCGACGC 4450
NLTPQHLTTTLQPKADA 
CGCCTGGCACCTCCACCACCACACCCAAAACCAACCCCTCACCCACTTCG 4500 

AWHLHHHTQNQPLTHF 
TCCTCTACTCCAGCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAAC 4550
VLYSSAAATLGSPGQAN
TACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACAC 4600

YAAANAFLDALATHRHT
CCAAGGACAACCCGCCACCACCATCGCCTGGGGCATGTGGCACACCACCA 4650

QGQPATTIAWGMWHTT
CCACACTCACCAGCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGC 4700
TTLTSQLTDSDRDRIRR
GGCGGCTTCCTGCCGATCTCGGACGACGAGGGCATGC
GGFLPISDDEGM

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50
MRLYEAARRTGSPVVV
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100
AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150

RTTVRRAAVRERSLAD
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200

RSPCCPTTSAPTPPSRS
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250

SWNSTATVLGHLGAEDI

CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300

PATTTFKELGIDSLTA
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350
VQLRNALTTATGVRLNA
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400

TAVFDFPTPRALAARLG
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450

DELAGTRAPVAARTAA

CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550

LPGGVASPQELWRLVAS
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600
GTDAITEFPADRGWDV
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700
HGGFLDGATGFDAAFFG

GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

ISPREALAMDPQQRVL
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850
ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950
SVLSGRLSYFYGLEGPS

GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000
VTVDTACSSSLVALHQA
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050

GQSLRSGECSLALVGG
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100
VTVMASPGGFVEFSRQR
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 
PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500

PLLLGSLKSNIGHAQA
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550

ASGVAGIIKMVQAIRHG
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600

ELPPTLHADEPSPHVDW
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650
TAGAVELLTSARPWPG
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700

TGRPRRAAVSSFGVSGT
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750

NAHIILEAGPVKTGPVE

GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800

AGAIEAGPVEVGPVEA
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 
GPLPAAPPSAPGEDLPL 
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900
LVSARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 

RAYLDTGPGVDRAAVA
AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000
QTLARRTHFTHRAVLLG

GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VYSGQGTQHPAMGEQL
CCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCG 2150
ADSSVVFAERMAECAAA 
TTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGC 2200 

LREFVDWDLFTVLDDPA 
GGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGG 2250 
VVDRVDVVQPASWAMM

TTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTG 2300 
VSLAAVWQAAGVRPDAV 
ATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGT 2350 
IGHSQGEIAAACVAGAV
GTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCG 2400

SLRDAARIVTLRSQAI 
CCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCG 2450
ARGLAGRGAMASVALPA
CAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCC 2500
QDVELVDGAWIAAHNGP
CGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCA 2550

ASTVIAGTPEAVDHVL
CCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTAT 2600
TAHEAQGVRVRRITVDY

GCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACAT 2650

ASHTPHVELIRDELLDI 
CACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCG 2700 
TSDSSSQTPLVPWLST 
TGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGG 2750
VDGTWVDSPLDGEYWYR 
AACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGC 2800 
NLREPVGFHPAVSQLQA 
CCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGC 2850

QGDTVFVEVSASPVLL
AGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGAC 2900

QAMDDDVVTVATLRRDD
GGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGG 2950

GDATRMLTALAQAYVHG

CGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTAC 3000
VTVDWPAILGTTTTRV
TGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCG 3050

LDLPTYAFQHQRYWLES
GCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGT 3100

APPATADSGHPVLGTGV

CGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCG 3150
AVAGSPGRVFTGPVPA

GTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGAC 3200 
GADRAVFIAELALAAAD 
GCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGG 3250
ATDCATVEQLDVTSVPG 
CGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCG 3300 

GSARGRATAQTWVDEP 
CCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCC 3350 
AADGRRRFTVHTRVGDA

CCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCA 3400 

PWTLHAEGVLRPGRVPQ 
GCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGG 3450 
PEAVDTAWPPPGAVPA 
ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCC 3500
DGLPGAWRRADQVFVEA 
GAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGC 3550 

EVDSPDGFVAHPDLLDA
GGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCG 3600
VFSAVGDGSRQPTGWR

ACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACC 3650 
DLAVHASDATVLRACLT
CGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAAT 3700
RRDSGVVELAAFDGAGM
GCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAG 3750

PVLTAESVTLGEVASA 
GCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTG 3800 
GGSDESDGLLRLEWLPV 
GCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCT 3850
AEAHYDGADELPEGYTL 
CATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACA 3900 

ITATHPDDPDDPTNPH 
ACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTC 3950 
NTPTRTHTQTTRVLTAL

CAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCAC 4000 

QHHLITTNHTLIVHTTT 
CGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACG 4050 
DPPGAAVTGLTRTAQN 
AACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCA 4100
EHPGRIHLIETHHPHTP 
CTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCAC 4150 
LPLTQLTTLHQPHLRLT 
CAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACA 4200

NNTLHTPHLTPITTHH

ACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCC 4250
NTTTTTPNTPPLNPNHA
ATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCA 4300

ILITGGSGTLAGILARH

CCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCA 4350
LNHPHTYLLSRTPPPP
CCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATC 4400

TTPGTHIPCDLTDPTQI

ACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACAC 4450

TQALTHIPQPLTGIFHT
CGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACC 4500

AATLDDATLTNLTPQH
TCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCAC 4550

LTTTLQPKADAAWHLHH
CACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGC 4600
HTQNQPLTHFVLYSSAA
CGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCT 4650

ATLGSPGQANYAAANA
TCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACC 4700
FLDALATHRHTQGQPAT

ACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACT 4750

TIAWGMWHTTTTLTSQL
CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCT 4800
TDSDRDRIRRGGFLPI
CGGACGACGAGGGCATGC
SDDEGM

Example 3 

Recombinant PKS Genes for 13-desmethoxy FK-506 and FK-520 
The present invention provides a variety of recombinant PKS genes in addition to
those described in Examples 1 and 2 for producing 13-desmethoxy FK-506 and FK-520
compounds. This Example provides the construction protocols for recombinant FK-520
and FK-506 (from Streptomyces sp. MA6858 (ATCC 55098), described in U.S. Patent
No. 5,116,756, incorporated herein by reference) PKS genes in which the module 8 AT
coding sequences have been replaced by either the rapAT3 (the AT domain from module
3 of the rapamycin PKS), rapAT12, eryAT1 (the AT domain from module 1 of the
erythromycin (DEBS) PKS), or eryAT2 coding sequences. Each of these constructs
provides a PKS that produces the 13-desmethoxy-13-methyl derivative, except for the
rapAT12 replacement, which provides the 13-desmethoxy derivative, i.e., it has a
hydrogen where the other derivatives have methyl.

Figure 7 shows the process used to generate the AT replacement constructs. First,
a fragment of ~4.5 kb containing module 8 coding sequences from the FK-520 cluster of
ATCC 14891 was cloned using the convenient restriction sites SacI and SphI (Step A in
Figure 7). The choice of restriction sites used to clone a 4.0 - 4.5 kb fragment comprising
module 8 coding sequences from other FK-520 or FK-506 clusters can be different
depending on the DNA sequence, but the overall scheme is identical. The unique SacI
and SphI restriction sites at the ends of the FK-520 module 8 fragment were then changed
to unique BglII and NsiI sites by ligation to synthetic linkers (described in the preceding
Examples, see Step B of Figure 7). Fragments containing sequences 5' and 3' of the AT8
sequences were then amplified using primers, described above, that introduced either an
AvrII site or an NheI site at two different KS/AT boundaries and an XhoI site at the
AT/DH boundary (Step C of Figure 7). Heterologous AT domains from the rapamycin
and erythromycin gene clusters were amplified using primers, as described above, that
introduced the same sites as just described (Step D of Figure 7). The fragments were
ligated to give hybrid modules with in-frame fusions at the KS/AT and AT/DH
boundaries (Step E of Figure 7). Finally, these hybrid modules were ligated into the
BamHI and PstI sites of the KC515 vector. The resulting recombinant phage were used to
transform the FK-506 and FK-520 producer strains to yield the desired recombinant cells,
as described in the preceding Examples.
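
The scheme above depends on each engineered restriction site occurring exactly once in the fragment being manipulated. The following is a minimal sketch, not part of the disclosed protocol, of how a candidate fragment could be screened for that property. The function names (find_sites, is_unique) are hypothetical; the recognition sequences are the standard ones for the named enzymes, and the test fragment is one 50-base line copied from the AvrII-XhoI hybrid listing above.

```python
# Standard recognition sequences for the enzymes named in this Example.
RECOGNITION = {
    "SacI": "GAGCTC",
    "SphI": "GCATGC",
    "BglII": "AGATCT",
    "NsiI": "ATGCAT",
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
}


def find_sites(seq, enzyme):
    """Return the 0-based start positions of every occurrence of the site."""
    seq = seq.upper()
    site = RECOGNITION[enzyme]
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def is_unique(seq, enzyme):
    """True when the enzyme's site occurs exactly once in the sequence."""
    return len(find_sites(seq, enzyme)) == 1


# One line of the AvrII-XhoI hybrid listing (bases 1651-1700).
fragment = "CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC"

for name in ("AvrII", "NheI", "XhoI"):
    print(name, find_sites(fragment, name), is_unique(fragment, name))
```

A real screen would be run over the complete module 8 fragment and vector rather than a single line, since a site that is unique in a short excerpt may recur elsewhere.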




The following table shows the location and sequences surrounding the engineered 
site of each of the heterologous AT domains employed. The FK-506 hybrid construct was 
used as a control for the FK-520 recombinant cells produced, and a similar FK-520 
hybrid construct was used as a control for the FK-506 recombinant cells. 



Heterologous AT                Enzyme        Location of Engineered Site

FK-506 AT8                     AvrII, NheI   GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC
(hydroxymalonyl)                             GRPRRAAVSSF
                                             ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC
                                             TQHPAMGERLA
                               XhoI          TACGCCTTCCAGCGGCGGCCCTACTGGatcgag
                                             YAFQRRPYWIE

rapamycin AT3                  AvrII, NheI   GACCGGccccgtCGGGCGGGCGTGTCGTCCTTC
(methylmalonyl)                              DRPRRAGVSSF
                                             TGGCAGTGGCTGGGGATGGGCAGTGCcctgcgG
                                             WQWLGMGSALR
                               XhoI          TACGCCTTCCAACACCAGCGGTACTGGgtcgag
                                             YAFQHQRYWVE

rapamycin AT12                 NheI          GGCCGAgcgcgcCGGGCAGGCGTGTCGTCCTTC
(malonyl)                                    GRARRAGVSSF
                                             TCGCAGCGTGCTGGCATGGGTGAGGAactggcC
                                             SQRAGMGEELA
                               XhoI          TACGCCTTCCAGCACCAGCGCTACTGGctcgag
                                             YAFQHQRYWLE

DEBS AT1                       NheI          GCGCGAccgcgcCGGGCGGGGGTCTCGTCGTTC
(methylmalonyl)                              ARPRRAGVSSF
                                             TGGCAGTGGGCGGGCATGGCCGTCGAcctgctC
                                             WQWAGMAVDLL
                               XhoI          TACCCGTTCCAGCGCGAGCGCGTCTGGctcgaa
                                             YPFQRERVWLE

DEBS AT2                       AvrII, NheI   GACGGGgtgcgcCGGGCAGGTGTGTCGGCGTTC
(methylmalonyl)                              DGVRRAGVSAF
                                             GCCCAGTGGGAAGGCATGGCGCGGGAgttgttG
                                             AQWEGMARELL
                               XhoI          TATCCTTTCCAGGGCAAGCGGTTCTGGctgctg
                                             YPFQGKRFWLL

The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-520 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

CCGGCGCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGccacggC
AGAVELLTSARPWPETDRPR

GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG
RAAVSSFGVSGTNAHVILEA

GACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGACCTTCCCCTGCTGGTGTCGG
GPVTETPAASPSGDLPLLVS

CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA
ARSPEALDEQIRRLRAYLDT

CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCC
TPDVDRVAVAQTLARRTHFA

ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG
HRAVLLGDTVITTPPADRPD

AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGCAgctcg
ELVFVYSGQGTQHPAMGEQL

cCGCCGCCCATCCCGTGTTCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC
AAAHPVFADAWHEALRRLDN
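The reading frame of the FK-520 module 8 KS/AT listing above can be checked mechanically. The following is a minimal sketch and not part of the original disclosure. It assumes the standard genetic code, assumes the eight nucleotide lines are contiguous, reads the engineered AvrII region as ccacgg, and assumes the listing begins on the second base of a codon, so that the first listed residue (A) is completed by a base that is not shown. The names `CODON_TABLE` and `translate` are illustrative.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate dna from offset, ignoring case and any trailing partial codon."""
    dna = dna.upper()
    end = offset + 3 * ((len(dna) - offset) // 3)
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, end, 3))


# Nucleotide lines as listed (lower case marks the engineered regions).
DNA_LINES = [
    "CCGGCGCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGccacggC",
    "GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG",
    "GACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGACCTTCCCCTGCTGGTGTCGG",
    "CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA",
    "CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCC",
    "ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG",
    "AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGCAgctcg",
    "cCGCCGCCCATCCCGTGTTCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC",
]

# Amino acid lines as listed beneath each nucleotide line.
PEPTIDE_LINES = [
    "AGAVELLTSARPWPETDRPR",
    "RAAVSSFGVSGTNAHVILEA",
    "GPVTETPAASPSGDLPLLVS",
    "ARSPEALDEQIRRLRAYLDT",
    "TPDVDRVAVAQTLARRTHFA",
    "HRAVLLGDTVITTPPADRPD",
    "ELVFVYSGQGTQHPAMGEQL",
    "AAAHPVFADAWHEALRRLDN",
]

dna = "".join(DNA_LINES)
peptide = "".join(PEPTIDE_LINES)

# The first two bases finish an incomplete codon, so translation starts at
# offset 2 and is compared with the listed peptide minus its first residue.
frame_ok = translate(dna, offset=2) == peptide[1:]
print("reading frame consistent:", frame_ok)
```

The same helper can be pointed at any of the other listings in this Example; only the starting offset needs to be chosen to match where the listing begins within a codon.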

The sequences shown below provide the location of the AT/DH boundary chosen 
in the FK-520 module 8 coding sequences. The region where an XhoI site was 
engineered is indicated by lower case and underlining. 

TCCTCGGGGCTGGGTCACGGCACGACGCGGATGTGCCCGCGTACGCGTTCCAACGGCGGC
ILGAGSRHDADVPAYAFQRR

ACTACTGGatcgagTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGGGCT
HYWIESARPAASDAGHPVLG

The sequences shown below provide the location of the KS/AT boundaries 
chosen in the FK-506 module 8 coding sequences. Regions where AvrII and NheI sites 
were engineered are indicated by lower case and underlining. 

TCGGCCAGGCCGTGGCCGCGGACCGGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTCGGG
SARPWPRTGRPRRAAVSSFG

GTGAGCGGCACCAACGCCCACATCATCCTGGAGGCCGGACCCGACCAGGAGGAGCCGTCG
VSGTNAHIILEAGPDQEEPS

GCAGAACCGGCCGGTGACCTCCCGCTGCTCGTGTCGGCACGGTCCCCGGAGGCACTGGAC
AEPAGDLPLLVSARSPEALD

GAGCAGATCGGGCGCCTGCGCGACTATCTCGACGCCGCCCCCGGCGTGGACCTGGCGGCC
EQIGRLRDYLDAAPGVDLAA

GTGGCGCGGACACTGGCCACGCGTACGCACTTCTCCCACCGCGCCGTACTGCTCGGTGAC
VARTLATRTHFSHRAVLLGD

ACCGTCATCACCGCTCCCCCCGTGGAACAGCCGGGCGAGCTCGTCTTCGTCTACTCGGGA
TVITAPPVEQPGELVFVYSG

CAGGGCACCCAGCATCCCGCGATGGGTGAGCGgctcgcCGCAGCCTTCCCCGTGTTCGCC
QGTQHPAMGERLAAAFPVFA

The sequences shown below provide the location of the AT/DH boundary chosen 
in the FK-506 module 8 coding sequences. The region where an XhoI site was 
engineered is indicated by lower case and underlining. 

GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGatcgagTCCGCGCCG
DPDVPAYAFQRRPYWIESAP
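To see what introducing each site could mean at the protein level, the sketch below overwrites the lower-case region of each FK-506 AT8 flanking fragment with the corresponding recognition sequence and retranslates it. This is an illustration under assumptions that the text does not state: that the six lower-case bases are replaced exactly by the recognition sequence (AvrII CCTAGG, NheI GCTAGC, XhoI CTCGAG), and that each 33-base fragment from the table of heterologous AT domains is in frame from its first base. The function names are hypothetical.

```python
import re

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Recognition sequences of the three enzymes named in the text.
SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}

# FK-506 AT8 flanking fragments; lower case marks the region chosen for each site.
FRAGMENTS = {
    "AvrII": "GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC",
    "NheI": "ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC",
    "XhoI": "TACGCCTTCCAGCGGCGGCCCTACTGGatcgag",
}


def translate(dna):
    """Translate an in-frame sequence, ignoring case and any trailing partial codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def engineer(fragment, site):
    """Replace the single lower-case run in fragment with site (assumed same length)."""
    match = re.search(r"[a-z]+", fragment)
    if match is None or len(match.group()) != len(site):
        raise ValueError("fragment must contain one lower-case run as long as the site")
    return fragment[:match.start()] + site + fragment[match.end():]


def compare(enzyme):
    """Return (native peptide, peptide after the assumed site substitution)."""
    native = FRAGMENTS[enzyme]
    return translate(native), translate(engineer(native, SITES[enzyme]))


for enzyme in SITES:
    before, after = compare(enzyme)
    status = "unchanged" if before == after else "changed"
    print(f"{enzyme}: {before} -> {after} ({status})")
```

Under these assumptions the AvrII and NheI substitutions leave the encoded residues unchanged, while the XhoI substitution converts the isoleucine of YWIE to leucine. Whether the actual constructs carry exactly these substitutions is not stated in the text.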

Example 4 

Replacement of Methoxyl with Hydrogen or Methyl at C-15 of FK-506 and FK-520 

The methods and reagents of the present invention also provide novel FK-506 and 
FK-520 derivatives in which the methoxy group at C-15 is replaced by a hydrogen or 
methyl. These derivatives are produced in recombinant host cells of the invention that 
express recombinant PKS enzymes that produce the derivatives. These recombinant PKS 
enzymes are prepared in accordance with the methodology of Examples 1 and 2, with the 
exception that the AT domain of module 7, instead of module 8, is replaced. Moreover, the 
present invention provides recombinant PKS enzymes in which the AT domains of both 
modules 7 and 8 have been changed. The table below summarizes the various compounds 
provided by the present invention. 






Compound   C-13       C-15       Derivative Provided

FK-506     hydrogen   hydrogen   13,15-didesmethoxy-FK-506
FK-506     hydrogen   methoxy    13-desmethoxy-FK-506
FK-506     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-506
FK-506     methoxy    hydrogen   15-desmethoxy-FK-506
FK-506     methoxy    methoxy    Original Compound - FK-506
FK-506     methoxy    methyl     15-desmethoxy-15-methyl-FK-506
FK-506     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-506
FK-506     methyl     methoxy    13-desmethoxy-13-methyl-FK-506
FK-506     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-506
FK-520     hydrogen   hydrogen   13,15-didesmethoxy-FK-520
FK-520     hydrogen   methoxy    13-desmethoxy-FK-520
FK-520     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-520
FK-520     methoxy    hydrogen   15-desmethoxy-FK-520
FK-520     methoxy    methoxy    Original Compound - FK-520
FK-520     methoxy    methyl     15-desmethoxy-15-methyl-FK-520
FK-520     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-520
FK-520     methyl     methoxy    13-desmethoxy-13-methyl-FK-520
FK-520     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-520
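The derivative names in the table above follow a regular pattern, which can be sketched as a small naming rule. This is an illustration only: the function `derivative_name` and the rule it encodes are inferred from the table entries, not stated in the text.

```python
from itertools import product

SUBSTITUENTS = ("hydrogen", "methoxy", "methyl")


def derivative_name(parent, c13, c15):
    """Name an analog from its C-13 and C-15 substituents, following the table's pattern."""
    groups = {"13": c13, "15": c15}
    for group in groups.values():
        if group not in SUBSTITUENTS:
            raise ValueError(f"unsupported substituent: {group}")
    # Positions that no longer carry the native methoxy group.
    lost = [pos for pos, group in groups.items() if group != "methoxy"]
    # Positions that carry a methyl group instead.
    methyl = [pos for pos, group in groups.items() if group == "methyl"]
    if not lost:
        return f"Original Compound - {parent}"
    parts = [",".join(lost) + ("-didesmethoxy" if len(lost) == 2 else "-desmethoxy")]
    if methyl:
        parts.append(",".join(methyl) + ("-dimethyl" if len(methyl) == 2 else "-methyl"))
    parts.append(parent)
    return "-".join(parts)


# Enumerate all nine C-13/C-15 combinations for each parent, in the table's row order.
for parent in ("FK-506", "FK-520"):
    for c13, c15 in product(SUBSTITUENTS, repeat=2):
        print(f"{parent:8s}{c13:10s}{c15:10s}{derivative_name(parent, c13, c15)}")
```

Each parent compound yields nine combinations, one of which is the unmodified compound, so the table describes eight derivatives of FK-506 and eight of FK-520.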



Example 5 

Replacement of Methoxyl with Ethyl at C-13 and/or C-15 of FK-506 and FK-520 

The present invention also provides novel FK-506 and FK-520 derivative 
compounds in which the methoxy group at C-13, at C-15, or at both positions is 
replaced by an ethyl group. These compounds are produced by novel PKS enzymes of the 
invention in which the AT domains of modules 8 and/or 7 are converted to 
ethylmalonyl-specific AT domains by modification of the PKS gene that encodes the module. 
Ethylmalonyl-specific AT domain coding sequences can be obtained from, for example, 
the FK-520 PKS genes, the niddamycin PKS genes, and the tylosin PKS genes. The 
novel PKS genes of the invention include not only those in which either or both of the 
AT domains of modules 7 and 8 have been converted to ethylmalonyl-specific AT 
domains but also those in which the AT domain of one module is converted to an 
ethylmalonyl-specific AT domain and that of the other is converted to a 
malonyl-specific or a methylmalonyl-specific AT domain. 
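The combinations described in this Example can be enumerated with a short sketch. It relies on the correspondence used in Examples 4 and 5 (module 8 governs C-13, module 7 governs C-15) and on the extender-unit labels in the table of heterologous AT domains. The mapping of hydroxymalonyl to the native methoxy group is an assumption made for illustration, and the names `SUBSTITUENT`, `POSITION_FOR_MODULE`, and `substituents` are hypothetical.

```python
from itertools import product

# Substituent resulting from each AT specificity.
SUBSTITUENT = {
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
    "hydroxymalonyl": "methoxy",  # assumption: native AT, methoxy in the final product
}

# Module whose AT domain determines each ring position.
POSITION_FOR_MODULE = {7: "C-15", 8: "C-13"}


def substituents(at7, at8):
    """Return the C-13 and C-15 substituents for the given module 7 and module 8 ATs."""
    return {
        POSITION_FOR_MODULE[8]: SUBSTITUENT[at8],
        POSITION_FOR_MODULE[7]: SUBSTITUENT[at7],
    }


# All module 7 / module 8 AT pairings that place an ethyl group at C-13, C-15, or both.
ethyl_variants = [
    (at7, at8)
    for at7, at8 in product(SUBSTITUENT, repeat=2)
    if "ethyl" in substituents(at7, at8).values()
]

for at7, at8 in ethyl_variants:
    print(f"module 7 AT: {at7:15s} module 8 AT: {at8:15s} -> {substituents(at7, at8)}")
```

Of the sixteen possible pairings of four AT specificities, seven carry at least one ethyl group, which corresponds to the single and mixed replacements described above.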



Example 6 

Neurotrophic Compounds 

The compounds described in Examples 1-4, inclusive, have immunosuppressant 
activity and can be employed as immunosuppressants in a manner and in formulations 
similar to those employed for FK-506. The compounds of the invention are generally 
effective for the prevention of organ rejection in patients receiving organ transplants and 
in particular can be used for immunosuppression following orthotopic liver 
transplantation. These compounds also have pharmacokinetic properties and metabolism 
that are more advantageous for certain applications relative to those of FK-506 or 
FK-520. These compounds are also neurotrophic; however, for use as neurotrophins, it is 
desirable to modify the compounds to diminish or abolish their immunosuppressant 
activity. This can be readily accomplished by hydroxylating the compounds at the C-18 
position using established chemical methodology or novel FK-520 PKS genes provided 
by the present invention. 

Thus, in one aspect, the present invention provides a method for stimulating nerve 
growth that comprises administering a therapeutically effective dose of 
18-hydroxy-FK-520. In another embodiment, the compound administered is a 
C-18,20-dihydroxy-FK-520 derivative. In another embodiment, the compound administered 
is a C-13-desmethoxy and/or C-15-desmethoxy 18-hydroxy-FK-520 derivative. In another 
embodiment, the compound administered is a C-13-desmethoxy and/or C-15-desmethoxy 
18,20-dihydroxy-FK-520 derivative. In other embodiments, the compounds are the 
corresponding analogs of FK-506. The 18-hydroxy compounds of the invention can be 
prepared chemically, as described in U.S. Patent No. 5,189,042, incorporated herein by 
reference, or by fermentation of a recombinant host cell provided by the present invention 
that expresses a recombinant PKS in which the module 5 DH domain has been deleted or 
rendered non-functional. 

The chemical methodology is as follows. A compound of the invention (~200 mg) 
is dissolved in 3 mL of dry methylene chloride and added to 45 μL of 2,6-lutidine, and 
the mixture stirred at room temperature. After 10 minutes, tert-butyldimethylsilyl 
trifluoromethanesulfonate (64 μL) is added by syringe. After 15 minutes, the reaction 
mixture is diluted with ethyl acetate, washed with saturated bicarbonate, washed with 
brine, and the organic phase dried over magnesium sulfate. Removal of solvent in vacuo 
and flash chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1% methanol) 
gives the protected compound, which is dissolved in 95% ethanol (2.2 mL) and to which 
is added 53 μL of pyridine, followed by selenium dioxide (58 mg). The flask is fitted 
with a water condenser and heated to 70°C on a mantle. After 20 hours, the mixture is 
cooled to room temperature, filtered through diatomaceous earth, and the filtrate poured 
into a saturated sodium bicarbonate solution. This is extracted with ethyl acetate, and the 
organic phase is washed with brine and dried over magnesium sulfate. The solution is 
concentrated and purified by flash chromatography on silica gel (ethyl acetate:hexane 
(1:2) plus 1% methanol) to give the protected 18-hydroxy compound. This compound is 
dissolved in acetonitrile and treated with aqueous HF to remove the protecting groups. 
After dilution with ethyl acetate, the mixture is washed with saturated bicarbonate and 
brine, dried over magnesium sulfate, filtered, and evaporated to yield the 18-hydroxy 
compound. Thus, the present invention provides the C-18-hydroxyl derivatives of the 
compounds described in Examples 1-4. 
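The reagent quantities in the procedure above can be put on a molar basis with a short calculation. The sketch below is illustrative only. The densities and molar masses are approximate handbook values supplied here as assumptions rather than taken from the text, and the substrate is assumed to have the molar mass of unmodified FK-520 (about 792 g/mol); an actual derivative would differ somewhat.

```python
# Assumed substrate: about 200 mg at the molar mass of unmodified FK-520.
SUBSTRATE_MG = 200.0
SUBSTRATE_MW = 792.0  # g/mol, assumption

# Liquids: (volume in microliters, density in g/mL, molar mass in g/mol).
# Densities and molar masses are approximate handbook values (assumptions).
LIQUIDS = {
    "2,6-lutidine": (45.0, 0.92, 107.15),
    "TBS triflate": (64.0, 1.151, 264.34),
    "pyridine": (53.0, 0.98, 79.10),
}

# Solids: (mass in mg, molar mass in g/mol).
SOLIDS = {
    "selenium dioxide": (58.0, 110.97),
}


def mmol_from_volume(microliters, density, molar_mass):
    """Microliters times g/mL gives milligrams; milligrams over g/mol gives mmol."""
    return microliters * density / molar_mass


def mmol_from_mass(milligrams, molar_mass):
    """Milligrams over g/mol gives mmol."""
    return milligrams / molar_mass


substrate_mmol = mmol_from_mass(SUBSTRATE_MG, SUBSTRATE_MW)

equivalents = {}
for name, (volume, density, molar_mass) in LIQUIDS.items():
    equivalents[name] = mmol_from_volume(volume, density, molar_mass) / substrate_mmol
for name, (mass, molar_mass) in SOLIDS.items():
    equivalents[name] = mmol_from_mass(mass, molar_mass) / substrate_mmol

for name, value in equivalents.items():
    print(f"{name}: about {value:.1f} equivalents")
```

Because the oxidation is run on the protected compound recovered after chromatography, the pyridine and selenium dioxide figures computed against the starting 200 mg are lower bounds on the equivalents actually present in that step.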

Those of skill in the art will recognize that other suitable chemical procedures can 
be used to prepare the novel 18-hydroxy compounds of the invention. See, e.g., Kawai et 
al., Jan. 1993, Structure-activity profiles of macrolactam immunosuppressant FK-506 
analogues, FEBS Letters 316(2): 107-113, incorporated herein by reference. These 
methods can be used to prepare both the C18-[S]-OH and C18-[R]-OH enantiomers, with 
the R enantiomer showing a somewhat lower IC50, which may be preferred in some 
applications. See Kawai et al., supra. Another preferred protocol is described in Umbreit 
and Sharpless, 1977, JACS 99(16): 1526-28, although it may be preferable to use 30 
equivalents each of SeO2 and t-BuOOH rather than the 0.02 and 3-4 equivalents, 
respectively, described in that reference. 

All scientific and patent publications referenced herein are hereby incorporated by 
reference. The invention having now been described by way of written description and 
example, those of skill in the art will recognize that the invention can be practiced in a 
variety of embodiments, and that the foregoing description and examples are for purposes 
of illustration and not limitation of the following claims. 